subject

Abstraction layer around subject data storage files

Classes:

Subject([name, dir, file, structure])

Class for managing one subject's data and protocol.

Functions:

_update_current(h5f)

Update the old 'current' filenode to the new Protocol Status

class Subject(name: typing.Optional[str] = None, dir: typing.Optional[pathlib.Path] = None, file: typing.Optional[pathlib.Path] = None, structure: autopilot.data.models.subject.Subject_Structure = <default Subject_Structure>)[source]

The default Subject_Structure declares the standard layout: /info (Subject Biographical Information), /data (stored with blosc:lz4 compression filters), /protocol (metadata for the currently assigned protocol), and /history (containing the past_protocols group and the hashes, history, and weights tables).

Bases: object

Class for managing one subject’s data and protocol.

Creates a tables hdf5 file in prefs.get(‘DATADIR’) with the general structure:

/ root
|--- current (tables.filenode) storing the current task as serialized JSON
|--- data (group)
|    |--- task_name  (group)
|         |--- S##_step_name
|         |    |--- trial_data
|         |    |--- continuous_data
|         |--- ...
|--- history (group)
|    |--- hashes - history of git commit hashes
|    |--- history - history of changes: protocols assigned, params changed, etc.
|    |--- weights - history of pre and post-task weights
|    |--- past_protocols (group) - stash past protocol params on reassign
|         |--- date_protocol_name - tables.filenode of a previous protocol's params.
|         |--- ...
|--- info - group with biographical information as attributes
Variables
  • name (str) – Subject ID

  • file (str) – Path to hdf5 file - usually {prefs.get(‘DATADIR’)}/{self.name}.h5

  • current_trial (int) – number of current trial

  • running (bool) – Flag that signals whether the subject is currently running a task or not.

  • data_queue (queue.Queue) – Queue to dump data while running task

  • did_graduate (threading.Event) – Event used to signal if the subject has graduated the current step

Parameters
  • name (str) – subject ID

  • dir (str) – path where the .h5 file is located; if None, prefs.get(‘DATADIR’) is used

  • file (str) – load a subject from this filename; if None, ignored.

  • structure (Subject_Structure) – Structure to use with this subject.
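The default file-path rule described above ({prefs.get(‘DATADIR’)}/{self.name}.h5) can be sketched with a small helper. default_subject_file is purely illustrative and not part of the API:

```python
from pathlib import Path

def default_subject_file(name: str, datadir: Path) -> Path:
    """Illustrative only: mirrors the documented default of
    {prefs.get('DATADIR')}/{name}.h5 when no dir or file is given."""
    return Path(datadir) / f"{name}.h5"
```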

Methods:

_h5f([lock])

Context manager for access to hdf5 file.

new(bio[, structure, path])

Create a new subject file, make its structure, and populate its Biography.

update_history(type, name, value[, step])

Update the history table when changes are made to the subject's protocol.

_find_protocol(protocol[, protocol_name])

Resolve a protocol from a name, path, etc.

_make_protocol_structure(protocol_name, protocol)

Use a Protocol_Group to make the necessary tables for the given protocol.

assign_protocol(protocol[, step_n, ...])

Assign a protocol to the subject.

prepare_run()

Prepares the Subject object to receive data while running the task.

_data_thread(queue, trial_table_path, ...)

Thread that keeps hdf file open and receives data while task is running.

save_data(data)

Alternate, equivalent method of putting data in the queue, identical to Subject.data_queue.put(data)

stop_run()

Puts 'END' in the data_queue, which causes _data_thread() to end.

get_trial_data([step])

Get trial data from the current task.

_get_step_data(step[, groups])

Get individual step data, using the protocol group if given; otherwise try to recover from the pytables description

_get_timestamp([simple])

Makes a timestamp.

get_weight([which, include_baseline])

Gets start and stop weights.

set_weight(date, col_name, new_value)

Updates an existing weight in the weight table.

update_weights([start, stop])

Store either a starting or stopping mass.

_graduate()

Increase the current step by one, unless it is the last step.

_update_structure()

Update old formats to new ones

Attributes:

info

Subject biographical information

bio

Subject biographical information (alias for info())

protocol

protocol_name

current_trial

session

step

task

session_uuid

history

hashes

weights

_h5f(lock: bool = True) → tables.file.File[source]

Context manager for access to hdf5 file.

Parameters

lock (bool) – Lock the file while it is open, only use False for operations that are read-only: there should only ever be one write operation at a time.

Examples

with self._h5f() as h5f:
    # … do hdf5 stuff

Returns

A context manager that opens the hdf5 file and yields it

property info: autopilot.data.models.biography.Biography

Subject biographical information

property bio: autopilot.data.models.biography.Biography

Subject biographical information (alias for info())

property protocol: Optional[autopilot.data.models.subject.Protocol_Status]
property protocol_name: str
property current_trial: int
property session: int
property step: int
property task: dict
property session_uuid: str
property history: autopilot.data.models.subject.History
property hashes: autopilot.data.models.subject.Hashes
property weights: autopilot.data.models.subject.Weights
classmethod new(bio: autopilot.data.models.biography.Biography, structure: typing.Optional[autopilot.data.models.subject.Subject_Structure] = <default Subject_Structure>, path: typing.Optional[pathlib.Path] = None) → autopilot.data.subject.Subject[source]

Create a new subject file, make its structure, and populate its Biography.

Parameters
  • bio (Biography) – A collection of biographical information about the subject! Stored as attributes within /info

  • structure (Optional[Subject_Structure]) – The structure of tables and groups to use when creating this Subject. Note: This is not currently saved with the subject file, so if using a nonstandard structure, it needs to be passed every time on init. Sorry!

  • path (Optional[pathlib.Path]) – Path of created file. If None, make a file within the DATADIR within the user directory (typically ~/autopilot/data) using the subject ID as the filename. (eg. ~/autopilot/data/{id}.h5)

Returns

The newly created Subject.

update_history(type, name: str, value: Any, step=None)[source]

Update the history table when changes are made to the subject’s protocol.

The current protocol is flushed to the past_protocols group and an updated filenode is created.

Note

This only updates the history table, and does not make the changes itself.

Parameters
  • type (str) – What type of change is being made? Can be one of

    • ‘param’ - a parameter of one task stage

    • ‘step’ - the step of the current protocol

    • ‘protocol’ - the whole protocol is being updated.

  • name (str) – the name of either the parameter being changed or the new protocol

  • value (str) – the value that the parameter or step is being changed to, or the protocol dictionary flattened to a string.

  • step (int) – When type is ‘param’, changes the parameter at a particular step, otherwise the current step is used.
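The shape of a history entry implied by these parameters can be sketched as a plain dict. make_history_row and its field names are hypothetical, not the actual pytables History description:

```python
from datetime import datetime
from typing import Optional

def make_history_row(type: str, name: str, value: str,
                     step: Optional[int] = None) -> dict:
    """Hypothetical sketch of a history-table row; validates the three
    documented change types before building the entry."""
    assert type in ('param', 'step', 'protocol'), "unknown change type"
    return {
        'time': datetime.now().isoformat(),  # when the change was recorded
        'type': type,                        # what kind of change
        'name': name,                        # parameter name or new protocol
        'value': value,                      # new value, or flattened protocol
        'step': step,                        # which step, for 'param' changes
    }
```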

_find_protocol(protocol: Union[pathlib.Path, str, List[dict]], protocol_name: Optional[str] = None) → Tuple[str, List[dict]][source]

Resolve a protocol from a name, path, etc. into a list of dictionaries

Returns

tuple of (protocol_name, protocol)

_make_protocol_structure(protocol_name: str, protocol: List[dict])[source]

Use a Protocol_Group to make the necessary tables for the given protocol.

assign_protocol(protocol: Union[pathlib.Path, str, List[dict]], step_n: int = 0, protocol_name: Optional[str] = None)[source]

Assign a protocol to the subject.

If the subject has a currently assigned task, stashes it with stash_current()

Creates groups and tables according to the data descriptions in the task class being assigned. eg. as described in Task.TrialData.

Updates the history table.

Parameters
  • protocol (Path, str, dict) – the protocol to be assigned. Can be one of

    • the name of the protocol (its filename minus .json) if it is in prefs.get(‘PROTOCOLDIR’)

    • filename of the protocol (its filename with .json) if it is in prefs.get(‘PROTOCOLDIR’)

    • the full path and filename of the protocol.

    • The protocol dictionary serialized to a string

    • the protocol as a list of dictionaries

  • step_n (int) – Which step is being assigned?

  • protocol_name (str) – If passing protocol as a list of dicts, a name must be given for the protocol
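The accepted forms of the protocol argument can be illustrated with a simplified resolver. resolve_protocol is a hypothetical sketch, not the real _find_protocol(); the prefs.get(‘PROTOCOLDIR’) lookup is omitted:

```python
import json
from pathlib import Path
from typing import List, Optional, Tuple, Union

def resolve_protocol(protocol: Union[Path, str, List[dict]],
                     protocol_name: Optional[str] = None) -> Tuple[str, List[dict]]:
    """Illustrative sketch of the documented argument forms."""
    if isinstance(protocol, list):
        # already a list of step dicts; a name must be supplied
        if protocol_name is None:
            raise ValueError("protocol_name is required for a list of dicts")
        return protocol_name, protocol
    if isinstance(protocol, str) and protocol.lstrip().startswith('['):
        # the protocol dictionary serialized to a string
        if protocol_name is None:
            raise ValueError("protocol_name is required for a serialized protocol")
        return protocol_name, json.loads(protocol)
    # otherwise treat it as a path or filename, with or without .json
    path = Path(protocol).with_suffix('.json')
    return path.stem, json.loads(path.read_text())
```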

prepare_run() → dict[source]

Prepares the Subject object to receive data while running the task.

Gets information about current task, trial number, spawns Graduation object, spawns data_queue and calls _data_thread().

Returns

the parameters for the current step, with subject id, step number, current trial, and session number included.

Return type

Dict

_data_thread(queue: queue.Queue, trial_table_path: str, continuous_group_path: str)[source]

Thread that keeps hdf file open and receives data while task is running.

Receives data through queue as dictionaries. Data can be partial-trial data (eg. each phase of a trial) as long as the task returns a dict with ‘TRIAL_END’ as a key at the end of each trial.

Each dict given to the queue should have the trial_num, and this method can properly store data without passing ‘TRIAL_END’ if so. I recommend being explicit, however.

Checks graduation state at the end of each trial.

Parameters

queue (queue.Queue) – passed by prepare_run() and used by other objects to pass data to be stored.
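The consumer loop described above can be sketched without any hdf5 machinery. collect_trials is a hypothetical, simplified stand-in for _data_thread() that buffers partial-trial dicts until ‘TRIAL_END’ and stops on the ‘END’ sentinel:

```python
import queue

def collect_trials(q: "queue.Queue") -> list:
    """Illustrative sketch: assemble partial-trial dicts into completed
    rows, committing a row when 'TRIAL_END' arrives."""
    rows, current = [], {}
    while True:
        data = q.get()
        if data == 'END':          # sentinel sent by stop_run()
            break
        done = 'TRIAL_END' in data
        # merge this chunk into the row being assembled
        current.update({k: v for k, v in data.items() if k != 'TRIAL_END'})
        if done:
            rows.append(current)   # one completed trial row
            current = {}
    return rows
```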

save_data(data)[source]

Alternate, equivalent method of putting data in the queue, identical to Subject.data_queue.put(data)

Parameters

data (dict) – trial data. each should have a ‘trial_num’, and a dictionary with key ‘TRIAL_END’ should be passed at the end of each trial.

stop_run()[source]

Puts ‘END’ in the data_queue, which causes _data_thread() to end.

get_trial_data(step: Optional[Union[int, list, str]] = None) → Union[List[pandas.core.frame.DataFrame], pandas.core.frame.DataFrame][source]

Get trial data from the current task.

Parameters

step (int, list, str, None) – Step that should be returned, can be one of

  • None: All steps (default)

  • -1: the current step

  • int: a single step

  • list: of step numbers or step names (excluding S##_)

  • string: the name of a step (excluding S##_)

Returns

DataFrame of requested steps’ trial data (or list of dataframes).

Return type

pandas.DataFrame
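The accepted forms of the step argument can be illustrated with a hypothetical selector over step-group names (the S##_step_name groups shown in the file layout). select_steps is illustrative only, not part of the API:

```python
from typing import List, Optional, Union

def select_steps(step: Union[int, list, str, None],
                 available: List[str],
                 current: int) -> List[str]:
    """Illustrative sketch: resolve the documented step forms into a
    list of step-group names like 'S00_free_water'."""
    if step is None:
        return available                      # all steps (default)
    if step == -1:
        return [available[current]]           # the current step
    if isinstance(step, int):
        return [available[step]]              # a single step by number
    if isinstance(step, str):
        # a step name, excluding the 'S##_' prefix
        return [s for s in available if s.split('_', 1)[1] == step]
    # a list of step numbers or names
    out = []
    for s in step:
        out.extend(select_steps(s, available, current))
    return out
```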

_get_step_data(step: int, groups: Optional[autopilot.data.models.protocol.Protocol_Group] = None) → pandas.core.frame.DataFrame[source]

Get individual step data, using the protocol group if given; otherwise try to recover from the pytables description

_get_timestamp(simple: bool = False) → str[source]

Makes a timestamp.

Parameters

simple (bool) –

if True:

returns as format ‘%y%m%d-%H%M%S’, eg ‘190201-170811’

if False:

returns in isoformat, eg. ‘2019-02-01T17:08:02.058808’

Returns

str
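The two documented formats can be reproduced with the standard library. make_timestamp is a sketch mirroring the description above, not the actual method:

```python
from datetime import datetime

def make_timestamp(simple: bool = False) -> str:
    """Illustrative sketch of the two documented timestamp formats."""
    if simple:
        # compact form, eg. '190201-170811'
        return datetime.now().strftime('%y%m%d-%H%M%S')
    # isoformat, eg. '2019-02-01T17:08:02.058808'
    return datetime.now().isoformat()
```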

get_weight(which='last', include_baseline=False)[source]

Gets start and stop weights.

Todo

add ability to get weights by session number, dates, and ranges.

Parameters
  • which (str) – if ‘last’, gets most recent weights. Otherwise returns all weights.

  • include_baseline (bool) – if True, includes baseline and minimum mass.

Returns

dict

set_weight(date, col_name, new_value)[source]

Updates an existing weight in the weight table.

Todo

Yes, I know this is bad. Merge with update_weights()

Parameters
  • date (str) – date in the ‘simple’ format, %y%m%d-%H%M%S

  • col_name (‘start’, ‘stop’) – are we updating a pre-task or post-task weight?

  • new_value (float) – New mass.

update_weights(start=None, stop=None)[source]

Store either a starting or stopping mass.

start and stop can be passed simultaneously; start can be given in one call and stop in a later call, but stop should not be given before start.

Parameters
  • start (float) – Mass before running task in grams

  • stop (float) – Mass after running task in grams.
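The ordering rule above (stop must not be recorded before start) can be sketched over a plain dict. update_weight_row and its row shape are hypothetical, not the actual weights-table logic:

```python
from typing import Optional

def update_weight_row(row: dict, start: Optional[float] = None,
                      stop: Optional[float] = None) -> dict:
    """Illustrative sketch: record start and/or stop mass in grams,
    rejecting a stop weight that arrives before any start weight."""
    if start is not None:
        row['start'] = start
    if stop is not None:
        if row.get('start') is None:
            raise ValueError("stop weight given before start weight")
        row['stop'] = stop
    return row
```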

_graduate()[source]

Increase the current step by one, unless it is the last step.

_update_structure()[source]

Update old formats to new ones

_update_current(h5f) → autopilot.data.models.subject.Protocol_Status[source]

Update the old ‘current’ filenode to the new Protocol Status