Selim Onat

I am a neuroscientist currently working on how humans generalize from what they have previously learnt. To do so, I use a variety of methods, including fMRI (1), autonomic (2), and eye-movement recordings (3).

This research emanates from the well-established field of "stimulus generalization", mainly following the "lineage" of Hovland, Hull and Roger Shepard (4), and including the more recent computational work of Joshua Tenenbaum (5). Furthermore, it integrates work on anxiety disorders, as it is believed that these mechanisms are impaired in people suffering from anxiety problems.

In the past, I worked on how the nervous system processes natural scenes, both at the electrophysiological and the sensory-motor level. Since the times of Hubel and Wiesel, visual processing had been overwhelmingly studied with artificial stimuli such as moving edges. However, this type of stimulus suffers from an ecological validity problem, as it only rarely occurs in real life. We therefore investigated cortical processing during viewing of natural movies. This previous work focused on visual processing, mostly using voltage-sensitive dye imaging and eye-tracking.

Reading EDF files in Matlab using Python

EDF2MatlabConversion

This article shows how to import .EDF files recorded with an Eyelink tracker into Matlab using Python.

(1) Python package: Pyedfread

Most of the work for reading an EDF file has been done in the Pyedfread package written by Niklas Wilming. Most of the information presented here is also present in the README file at the Pyedfread project page.

The second thing is that Python can save data in the HDF format, a file format designed to store large amounts of data.

The good news is that Matlab can read the HDF5 format as well. So this article might help you if you want to work with EDF files in Matlab but do not have a convenient way of importing the eye-tracker data.

Once the Pyedfread package is installed, the following should work.

In [18]:
from pyedfread import edf
# Path to the EDF file recorded with the Eyelink tracker
edf_path = '/home/onat/Documents/Code/Python/JupyterNotebooks/EDF2MatlabConversion/data.edf'
# Read samples, events and messages from the EDF file
samples, events, messages = edf.pread(edf_path, filter='TRIALID')
# samples, events, messages = edf.pread('/home/onat/data.edf')

These are now imported as pandas.DataFrame objects, similar to Matlab's Table objects. They print nicely when we simply call the variable name.

TRIALID is the name of the messages I sent as strings during the experiment. Each one is now included in the DataFrame under the column trialid, with the corresponding time stamp in trialid_time. This string contains all the information.

In [13]:
messages.head(5)
Out[13]:
   SYNCTIME  SYNCTIME_start  py_trial_marker                                            trialid  trialid_time
0  18008463        18008463                0  TRIALID: 0001, PHASE: 0004, FILE: 0002, DELTAC...      18007408
1  18008463        18008463                0  TRIALID: 0001, PHASE: 0004, FILE: 0002, DELTAC...      18007408
2  18014465        18014465                1  TRIALID: 0002, PHASE: 0004, FILE: 0005, DELTAC...      18013422
3  18020479        18020479                2  TRIALID: 0003, PHASE: 0004, FILE: 0000, DELTAC...      18019525
4  18026483        18026483                3  TRIALID: 0004, PHASE: 0004, FILE: 0004, DELTAC...      18025557
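
Since each trialid string packs several fields into one message, it can be convenient to split it into named values before further analysis. The following is a minimal sketch in plain Python, assuming the messages follow the "KEY: number" pattern visible in the output above. The function name parse_trial_message is hypothetical, and the example string contains only the fields that are not truncated in the output.

```python
import re


def parse_trial_message(message):
    """Split a 'KEY: number, KEY: number' message into a dict of integers.

    Assumes every field looks like 'NAME: 0001'; other text is ignored.
    """
    fields = {}
    for key, value in re.findall(r"(\w+):\s*(\d+)", message):
        fields[key.lower()] = int(value)
    return fields


# Hypothetical example limited to the fields visible in the output above.
example = "TRIALID: 0001, PHASE: 0004, FILE: 0002"
parsed = parse_trial_message(example)
print(parsed)
```

Applied to every row of the trialid column, a helper like this would give one numeric column per field, which is easier to handle on the Matlab side than a string.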

We can see what they contain by listing the column names.

In [14]:
messages.columns
Out[14]:
Index(['SYNCTIME', 'SYNCTIME_start', 'py_trial_marker', 'trialid ',
       'trialid_time'],
      dtype='object')

We see all the messages that have been sent to the eye-tracker computer. The SYNCTIME message should mark stimulus onset.

As expected, SAMPLES contains all the recorded samples with their time stamps.

In [15]:
samples[samples.columns[0:4]].head(5)
Out[15]:
         time   px_left  px_right   py_left
0  18007486.0  -32768.0  -32768.0  -32768.0
1  18007487.0  -32768.0  -32768.0  -32768.0
2  18007488.0  -32768.0  -32768.0  -32768.0
3  18007489.0  -32768.0  -32768.0  -32768.0
4  18007490.0  -32768.0  -32768.0  -32768.0

The events channel contains all events that have been detected by the real-time parser. These include blinks, button presses, fixations, saccades, etc. There were 5078 events detected during the course of the experiment; the first five are listed below.

In [16]:
print(events.columns)
events.head(5)
Index(['ava', 'avel', 'blink', 'buttons', 'ena', 'end', 'eupd_x', 'evel',
       'eye', 'gavx', 'gavy', 'genx', 'geny', 'gstx', 'gsty', 'havx', 'havy',
       'henx', 'heny', 'hstx', 'hsty', 'message', 'pvel', 'sta', 'start',
       'supd_x', 'svel', 'time', 'trial', 'type'],
      dtype='object')
Out[16]:
ava avel blink buttons ena end eupd_x evel eye gavx ... hsty message pvel sta start supd_x svel time trial type
0 4893.0 5.000000 False 0 4901.0 18007616 36.299999 5.200000 1 646.200012 ... -2096.0 23.400000 4888.0 18007521 36.299999 23.400000 0 0 fixation
1 0.0 246.199997 False 0 0.0 18007707 38.799999 19.299999 1 0.000000 ... -2041.0 539.000000 0.0 18007617 36.299999 7.000000 0 0 saccade
2 4876.0 5.600000 False 0 4857.0 18007823 38.799999 1.700000 1 1341.699951 ... 610.0 20.000000 4887.0 18007708 38.799999 20.000000 0 0 fixation
3 0.0 63.599998 False 0 0.0 18007858 39.200001 24.299999 1 0.000000 ... 636.0 164.000000 0.0 18007824 38.799999 1.200000 0 0 saccade
4 4865.0 4.700000 False 0 4893.0 18008012 39.200001 4.200000 1 1383.000000 ... 825.0 27.799999 4859.0 18007859 39.200001 27.200001 0 0 fixation

5 rows × 30 columns

Now we would like to save these 3 output variables in a format that Matlab can read.

To do that, let's re-read the data without excluding samples, and save messages, events and samples using the HDF format.

TO_HDF is a method of pandas.DataFrame. Using the append mode (mode='a'), all three datasets can be added to a single HDF file. Below, the save_human_understandable function from Pyedfread is used to write the file.

In [20]:
samples, events, messages = edf.pread(edf_path,filter='SYNCTIME')
edf.save_human_understandable(samples,events,messages,'/home/onat/Documents/Code/Python/JupyterNotebooks/EDF2MatlabConversion/data.hdf')

We can then import the HDF file in Matlab using the h5* functions, such as h5info and h5read. For example,

>> info = h5info('/home/onat/data.hdf');
>> info.Groups(3)
ans = 
          Name: '/samples'
        Groups: []
      Datasets: [40x1 struct]
     Datatypes: []
         Links: []
    Attributes: []

INFO now contains 3 groups of datasets: samples, events and messages.

To plot, for example, the 12th trial, we would do something like the following.

%onset of the 12th stimulus
>> onsets              = h5read('test2.hdf', '/messages/SYNCTIME');
>> current_trial       = 12;
>> current_trial_onset = double(onsets(current_trial));%time of onset.
% Gaze x, y coordinates and pupil size
>> X             = h5read('test2.hdf', '/samples/gx_right');
>> Y             = h5read('test2.hdf', '/samples/gy_right');
>> P             = h5read('test2.hdf', '/samples/pa_right');
>> time          = h5read('test2.hdf', '/samples/time');

% Find the sample where CURRENT_TRIAL is ON.
>> [~,i]         = min(abs(time-current_trial_onset));
% Indices for this trial assuming a duration of ~1500ms
>> I             = i:i+1498;
>> T             = (time(I)-time(I(1)))/1000;
>> Y             = Y(I);
>> X             = X(I);
>> P             = P(I);
% Blinks are coded as 10^8, NaNize these samples.
>> invalid       = (Y == 100000000);
>> X(invalid)    = NaN;
>> Y(invalid)    = NaN;
>> P(invalid)    = NaN;
>> T(invalid)    = NaN;
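% Plot z-scored gaze position and pupil size against time.
% NANZSCORE is assumed to be a user-defined, NaN-aware z-score helper.
>> figure(122)
>> plot(T,nanzscore(X),'go-',T,nanzscore(Y),'ro-',T,nanzscore(P),'mo-');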

This should produce the following figure, where the data corresponding to the blinks are set to NaN.

[figure]
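
For readers who prefer to stay in Python, the same trial extraction can be sketched without any Matlab. This is only an illustration in plain Python under stated assumptions: time stamps are sorted, blinks are coded as 10^8 as in the Matlab snippet, and the z-score uses the sample standard deviation. The names nearest_index, extract_trial and nan_zscore are hypothetical, and the toy data are made up for the example. Unlike the Matlab code, which derives the blink mask from Y and applies it to all signals, this sketch masks each signal on its own.

```python
import math
from bisect import bisect_left

BLINK_CODE = 100000000  # value treated as a blink in the Matlab snippet (10^8)


def nearest_index(sorted_times, t):
    """Index of the sample whose time stamp is closest to t (earlier one on ties)."""
    pos = bisect_left(sorted_times, t)
    if pos == 0:
        return 0
    if pos == len(sorted_times):
        return len(sorted_times) - 1
    before, after = sorted_times[pos - 1], sorted_times[pos]
    return pos if (after - t) < (t - before) else pos - 1


def extract_trial(times, signal, onset, n_samples):
    """Cut n_samples starting at the sample nearest to onset; blinks become NaN."""
    start = nearest_index(times, onset)
    window = signal[start:start + n_samples]
    return [math.nan if v == BLINK_CODE else float(v) for v in window]


def nan_zscore(values):
    """Z-score that ignores NaN entries when computing mean and std."""
    valid = [v for v in values if not math.isnan(v)]
    if len(valid) < 2:
        return [math.nan] * len(values)
    mean = sum(valid) / len(valid)
    std = math.sqrt(sum((v - mean) ** 2 for v in valid) / (len(valid) - 1))
    if std == 0:
        return [math.nan] * len(values)
    # NaN inputs stay NaN because arithmetic with NaN yields NaN.
    return [(v - mean) / std for v in values]


# Toy data, invented for illustration only.
times = [100, 101, 102, 103, 104, 105]
gaze_y = [10.0, 12.0, BLINK_CODE, 14.0, 16.0, 18.0]

trial = extract_trial(times, gaze_y, onset=101.2, n_samples=4)
z = nan_zscore(trial)
print(trial)
print(z)
```

With real data, times and gaze_y would come from the samples table, and the onset from the SYNCTIME column of the messages table.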

CONCLUSION

If you move fluidly between Matlab and Python, this strategy could be relevant. However, things easily get complicated on the Matlab side, and clearly some housekeeping routines have to be coded.