
BehavioralMicroarray Usage

BehavioralMicroarray Matlab Toolbox

The BehavioralMicroarray Toolbox for Matlab contains a number of functions for analyzing the behaviors of the walking flies whose trajectories are computed using Ctrax. The toolbox includes code for computing per-frame statistics such as speed, distance to closest fly, etc., code for learning behavior classifications, code for detecting learned behaviors in the trajectories, code for histogramming various statistics of behavior, and code for analyzing and comparing the distributions of behavioral statistics for different types of flies. The analyses possible are similar to those performed in our paper, "High-Throughput Ethomics in Large Groups of Drosophila".




showtrx.m

Screenshot of showtrx.m

This script allows you to view a movie annotated with the trajectories computed by Ctrax. First, input the movie and corresponding trajectory MAT-file. This movie is shown in the main window of the GUI that is started. You can scroll to different frames or play the movie. You can select any number of flies and play the movie keeping the axes zoomed on these flies only. Per-frame properties of the last-clicked fly are also shown in the "Selected Flies" panel. More per-frame statistics can be computed by clicking the "Compute Per-Frame Stats" button, which calls the compute_perframe_stats.m script.


simple_diagnostics.m

This script inputs the trajectories of flies in a video and outputs a few simple visualizations of the data. In figure 1, it plots the (x,y) position of each fly in a separate subplot. In figure 2, it plots a histogram of the position in the arena of all flies (frequency in the left subplot, log frequency in the right subplot). In figure 3, it plots a histogram of the speed of all the flies in black as well as each fly individually in color. In figure 4, it plots a histogram of the angular speed (change in orientation) of all the flies in black as well as each fly individually in color.

Screenshot of simple_diagnostics.m
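The speed and angular speed histogrammed in figures 3 and 4 can be derived from the per-frame positions and orientations. The following is a minimal sketch in Python (the toolbox itself is MATLAB), assuming positions in pixels and orientations in radians. The function names and the wrapping of orientation differences to [-pi, pi) are assumptions made for illustration, not the toolbox's code.

```python
import math

def per_frame_speed(x, y):
    """Distance moved between consecutive frames (pixels/frame)."""
    return [math.hypot(x[t + 1] - x[t], y[t + 1] - y[t])
            for t in range(len(x) - 1)]

def per_frame_angular_speed(theta):
    """Absolute change in orientation between consecutive frames (radians/frame)."""
    out = []
    for t in range(len(theta) - 1):
        d = theta[t + 1] - theta[t]
        # Wrap the difference to [-pi, pi) so that a small turn across the
        # +/-pi boundary is not counted as a nearly full rotation.
        d = (d + math.pi) % (2 * math.pi) - math.pi
        out.append(abs(d))
    return out
```

For example, `per_frame_speed([0, 3, 3], [0, 4, 4])` gives `[5.0, 0.0]`.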

make_ctrax_result_movie.m

25 female and 25 male flies in 24.5-cm diameter open arena.

This function makes an annotated AVI, like those shown in Arena and Video Requirements, from the raw video of the flies in the arena and the trajectories returned by Ctrax/FixErrors. You will be prompted for the raw video to input, the name of the MAT-file containing the trajectories, and the name of the AVI file to write the annotated movie to. You can then set which frames to output (first frame and number of frames), the video frames per second, the compressor (Windows only), and the number and size of the zoomed-in fly boxes shown in the right panel of the movie. Uncompressed videos exported by this function can be compressed by any of the programs listed under Ctrax: File->Export as AVI.


load_tracks.m

Usage: [trx,matname,succeeded] = load_tracks

This function loads the trajectories output to a MAT-file by Ctrax into the easier-to-use structure trx, the representation of trajectories used throughout this toolbox. It prompts you for a MAT-file containing trajectories. This can be either a MAT-file output by Ctrax or any .mat file containing the trx variable, that is, the output of load_tracks.m, convert_units.m, compute_perframe_stats.m, compute_perframe_stats_social.m, or classify_by_area.m. If the input MAT-file is the raw output from Ctrax, the function stores trx in an automatically created file whose name is returned in matname. The returned value succeeded indicates whether the trajectories were loaded successfully. trx is an array of structs with one element per fly (i.e., length(trx) is the number of flies). trx(fly) has the following member variables.


convert_units.m

This script allows you to convert from the video units of pixels and frames to the real units of millimeters and seconds.

  1. Select a MAT-file containing the flies' trajectories to convert.
  2. Set the number of frames per second.
  3. Set the conversion from pixels to millimeters manually or compute this conversion by entering the known distance in millimeters between two landmark points.
    1. In the "Compute" option, you will first select the movie file corresponding to the trajectories. The first frame of this movie is then read in and displayed.
    2. Draw a line between two landmark points.
    3. Set the length of this line in millimeters.
    4. Select a MAT-file to save the results to.
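The arithmetic behind the "Compute" option can be sketched as follows. This is an illustrative Python sketch (the toolbox itself is MATLAB); the function names `pixels_per_mm` and `to_mm_per_second` are hypothetical and the landmark coordinates are made up.

```python
import math

def pixels_per_mm(p1, p2, known_distance_mm):
    """Scale factor from a line drawn between two landmarks of known separation."""
    length_px = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    return length_px / known_distance_mm

def to_mm_per_second(speed_px_per_frame, px_per_mm, fps):
    """Convert a speed from pixels/frame to millimeters/second."""
    return speed_px_per_frame / px_per_mm * fps

# Hypothetical example: the drawn line is 500 pixels long and spans 125 mm.
scale = pixels_per_mm((100, 200), (400, 600), 125.0)   # 4.0 pixels per mm
speed = to_mm_per_second(2.0, scale, 20.0)             # 10.0 mm/s at 20 fps
```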

The resulting trx structure (see load_tracks.m) is augmented with the following member variables:


compute_perframe_stats.m

This script inputs the trajectories of flies in a video and outputs a number of per-frame properties of interest. These include per-frame speed and acceleration properties such as velocity, angular velocity, forward and sideways velocity, velocity direction, change in velocity direction, velocity of the nose and tail of the fly, center of rotation, speed of center of rotation, and acceleration. In addition, there is the option to compute arena-based properties for circular arenas, such as distance to wall, orientation relative to the wall, and change in distance to the wall. Finally, there is the option to compute per-frame properties involving other flies, such as distance to the closest fly (center-center, nose-ellipse), maximum angle of the fly's vision subtended by another fly, velocity toward the closest fly, orientation and velocity direction relative to closest fly, and change in any of these properties. More specifically, the following per-frame properties are always computed. All of these fields are appended to the trx structure.

Basic parameters:

Circular arena parameters:

Closest fly parameters:
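As an illustration of the kind of quantity computed here, the sketch below decomposes the displacement of a fly's center into forward and sideways components relative to its orientation. It is written in Python for illustration only (the toolbox itself is MATLAB). The function name, the use of the orientation in the earlier of the two frames, and the sign convention for the sideways direction are all assumptions, not the toolbox's definitions.

```python
import math

def forward_sideways_velocity(x, y, theta):
    """Project per-frame displacement onto the fly's body axes.

    x, y: center position per frame; theta: orientation per frame (radians).
    Returns (forward, sideways) lists, one value per pair of consecutive frames.
    """
    forward, sideways = [], []
    for t in range(len(x) - 1):
        dx = x[t + 1] - x[t]
        dy = y[t + 1] - y[t]
        c, s = math.cos(theta[t]), math.sin(theta[t])
        forward.append(dx * c + dy * s)     # component along the body axis
        sideways.append(-dx * s + dy * c)   # component perpendicular to it (assumed sign)
    return forward, sideways
```

A fly facing along the x-axis (theta = 0) that moves one pixel in x has a forward velocity of 1 and a sideways velocity of 0.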


compute_perframe_stats_social.m

Like compute_perframe_stats.m, this script inputs the trajectories of flies in a video and outputs a number of per-frame properties of interest. These per-frame properties are functions of the trajectories of each pair of flies. Because a trajectory is computed for each pair of flies, the script saves the results to multiple MAT-files. Each MAT-file corresponds to a different main fly fly1, and the saved trajectory in pairtrx(fly2) corresponds to the pair (fly1,fly2). The name of each MAT-file is generated automatically as the original MAT-file name with "_perframepairs_fly[fly1]_start[t0]_end[t1]" appended, where [fly1] stands for the index of the main fly, [t0] for the first frame of the pair trajectory, and [t1] for the last frame of the pair trajectory. For example, if fly1 = 5, t0 = 1, and t1 = 1000, then the script will append "_perframepairs_fly05_start00001_end01000" to the MAT-file's name. The per-frame pairwise parameters computed are:
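The naming pattern can be reproduced with a short formatting sketch, shown here in Python for illustration (the toolbox itself is MATLAB). The zero-padding widths (two digits for the fly index, five for the frames) are inferred from the single example in the text and are an assumption; the function name is hypothetical.

```python
def pair_file_suffix(fly1, t0, t1):
    """Suffix appended to the original MAT-file name for main fly `fly1`."""
    # Padding widths are inferred from the documented example, not from the code.
    return f"_perframepairs_fly{fly1:02d}_start{t0:05d}_end{t1:05d}"

suffix = pair_file_suffix(5, 1, 1000)
# suffix == "_perframepairs_fly05_start00001_end01000"
```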


classify_by_area.m

Screenshot of classify_by_area.m

This script classifies the types of flies in a movie based on their areas. For example, it can be used to classify the sex of flies, as males are smaller than females.

  1. The script first prompts for MAT-files containing the trajectories of the flies. Note that you can enter multiple files to classify the types of the flies in more than one movie simultaneously.
  2. If an area-based classifier has already been defined during a previous run of this script, it can be loaded in at this step by choosing the saved MAT-file containing the parameters of this area-based classifier.
  3. The measured image area of the fly will often depend on its location in the arena, in particular because of lighting differences. There is the option to normalize for this effect by learning the quadratic regression that best predicts the area of the fly given its location in the arena. For trajectory MAT-files which may contain more than one type of fly, the script actually predicts the ratio of the area in a particular frame to the median area for a fly over the entire trajectory. Better performance may be achieved by using homogeneous trajectory files -- files where all flies are approximately the same size. Then, the algorithm can predict the ratio of per-frame area to the median area over all frames and flies. You can load in more trajectory MAT-files to optimize these normalization parameters, including homogeneous trajectory files, at this step. You can specify whether the extra MAT-files loaded in are homogeneous or not. If they are, then the normalization terms are estimated only from these homogeneous MAT-files, otherwise all MAT-files are used. Also, if the normalization parameters have already been estimated, there is an option to load these in from a saved MAT-file. After the normalization parameters have been estimated, a figure will pop up showing the normalization function learned -- each pixel in the image encodes the fit ratio of area to median area for that location in the image.
  4. The script then brings up a figure showing the (possibly normalized) median area for each fly over its entire trajectory. The fly identities (x-axis) are sorted in order of increasing area (y-axis). The color of the plotted point reflects which MAT-file the fly is from, and a key is given in the legend. For reference, gray lines give the 5th, 25th, 75th, and 95th percentile area for the fly over its trajectory. The red cross-hairs show the current estimate of the threshold between the two types of flies. They can be dragged to move the threshold. To select the currently displayed threshold, hit ENTER.
  5. Now define the name of the type variable, as well as the names of the two types of flies. For instance, the type variable might be defined to be "sex", and the smaller type might be defined to be "M" (for male) and the larger type might be defined to be "F" (for female). In this case, for each fly, the field sex would be added to the trx variable, and its value will be set to either 'M' or 'F', depending on the fly's median area.
  6. The updated trx variable can then be saved to a MAT-file. The script prompts for the names of these files.
  7. The parameters of the area-based classifier can then be saved to a MAT-file for future use.
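The thresholding in steps 4 and 5 amounts to comparing each fly's median area to a single cutoff. The sketch below shows that idea in Python for illustration only (the toolbox itself is MATLAB). The function name is hypothetical, the area values are made up, the normalization of step 3 is omitted, and how a fly exactly at the threshold is treated is an assumption.

```python
from statistics import median

def classify_by_median_area(areas_per_fly, threshold,
                            small_label="M", large_label="F"):
    """Label each fly by comparing its median area to `threshold`.

    areas_per_fly: one list of per-frame areas for each fly.
    """
    labels = []
    for areas in areas_per_fly:
        # A fly exactly at the threshold is given the larger label (assumption).
        labels.append(small_label if median(areas) < threshold else large_label)
    return labels

# Two hypothetical flies: median areas 11 and 20, threshold 15.
sexes = classify_by_median_area([[10, 11, 12], [20, 19, 22]], 15)
# sexes == ["M", "F"]
```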

learn_params.m [deprecated]

Note that learn_params.m has been superseded by JAABA. It still exists in the Behavioral Microarray Toolbox, but its use is no longer encouraged.

This script allows you to label the starts and ends of episodes of a single behavior. It then learns the threshold-based classifier that best replicates these labels.

  1. Choose whether to learn a behavior classifier from pre-labeled data (i.e. whether you have already labeled data or want to label new data).
  2. To label data where the fly is performing and not performing the behavior:

    Screenshot of labelbehaviors.m function

    1. Choose corresponding pairs of movies and MAT-files containing per-frame properties (see compute_perframe_stats.m).
    2. Choose the number of flies to label in each movie, and the number of frames to label for each fly labeled. Flies and intervals of frames to label are selected randomly from each movie.
    3. Choose the file to save the labels to. If you want to learn parameters again from this labeled data using different settings, you can load this file instead of relabeling.
    4. A GUI will then open, allowing you to set the start and ends of sequences in which the fly is performing the behavior.
    5. The frame of the video currently shown can be moved using the frame slider or by clicking on the fly's trajectory.
    6. To add a sequence in which the fly is performing the behavior, go to the start of the sequence and hit the "Add Start" button. This will highlight the segment starting at the current frame and going to the end of the trajectory (or until the start of the next segment).
    7. To set the end of this segment, go to the last frame of the behavior, and hit the "Add End" button.
    8. Labeled segments can be removed with the "Delete" button.
    9. You can zoom in and out using the figure toolbar. The small window in the top right shows what part of the video is shown in the larger main window.
    10. In the bottom panel below the main window, per-frame properties of the fly are shown. du_ctr is the forward velocity of the fly's center (relative to the fly's orientation, in pixels/frame), dv_ctr is the sideways velocity of the fly's center, and dtheta is the change in orientation (degrees/frame). cor is the center of rotation along the fly's major axis. The center of rotation is defined as the point on the fly that translates the least from one frame to the next, and cor is this point projected onto the major axis (it varies between -1, corresponding to the tail, and 1, corresponding to the nose). du_cor and dv_cor are the forward and sideways velocity of the fly's center of rotation.
    11. The color of the plotted point is related to the fly's speed [for Matlab experts, the "Scatter Command" can be changed to any valid Matlab command to change the property encoding the trajectory's color].
    12. When done labeling a fly and its trajectory interval, close the window to go to the next fly/movie/finish.
  3. Choose a MAT-file to save the learned parameters to.

    Screenshot of chooseproperties.m function

  4. Choose properties of the behavior classifier to learn. The behavior classifier is based on four types of bounds. The trajectory from time t1 to time t2 can be classified as the behavior if:
    1. the per-frame bounds hold: in each frame, given properties are within given loose ranges,
    2. the near-frame bounds hold: for each frame t, there is a frame within r frames of t in which given properties are within given tighter ranges,
    3. the sequence sum bounds hold: the total summed value of given properties over the entire sequence is within given bounds, and
    4. the sequence mean bounds hold: the mean value of given properties over the entire sequence is within given bounds.
    Within the "chooseproperties" dialog, you can select which properties are used for each of the four types of bounds. The exact parameter ranges will be learned automatically later, but you can also set sensible bounds on the values that these ranges can take. For instance, if the property cannot be negative, it is useful to bound the ranges at >=0.
  5. Now that all the parameters have been set, the software will automatically try to find the ranges of the bounds that result in the segmentation of the training data closest to the labels provided. Figure 1234 shows the current "Score" of the classifiers considered (higher is better). You can hit "Quit" at any time to use the best classifier found so far, or wait until the search converges. Properties of the current distribution of the classifier parameters and the types of classification errors made on the training data are printed out at fixed intervals.

    The "Score" is a simple function of the errors relative to the classifier, summed over the whole movie. Each type of error is assigned a weight, and the error for the movie is the weighted sum of the errors of each type. Each error itself has both a constant and a linear cost, with the linear portion weighted by the duration of the erroneous behavior.

  6. The learned parameters have been saved to the file selected in step 3. These can be input to a behavior detector.
  7. To illustrate the differences between the detected and labeled behaviors, the script will plot the fly's position in and around frames where either the behavior is labeled or detected. The fly's position is plotted in black. Labeled behaviors are plotted in magenta and detected behaviors in cyan. At the start and end of each detected behavior, green and red circles are shown, respectively. There is a subplot for each labeled "fly". Nothing is plotted in cases where no behaviors are labeled or detected.
  8. Finally, the script prints out the parameters of the classifier.

    Screenshot of learn_params.m output.
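The four bound types from step 4 can be sketched for a single per-frame property as follows. This is an illustrative Python sketch (the toolbox itself is MATLAB), not the toolbox's implementation: the function names are hypothetical, bounds are assumed to be inclusive (low, high) pairs, and the near-frame window is assumed to extend outside the sequence but to be clipped at the ends of the trajectory.

```python
def in_range(value, bounds):
    low, high = bounds
    return low <= value <= high

def satisfies_bounds(values, t1, t2, per_frame, near_frame, r, seq_sum, seq_mean):
    """Check frames t1..t2 (inclusive, 0-based) of `values` against all four bounds."""
    seq = values[t1:t2 + 1]
    # 1. Per-frame bounds: every frame lies in the loose range.
    if not all(in_range(v, per_frame) for v in seq):
        return False
    # 2. Near-frame bounds: each frame has a frame within r frames of it
    #    that lies in the tighter range.
    for t in range(t1, t2 + 1):
        window = values[max(0, t - r):t + r + 1]
        if not any(in_range(v, near_frame) for v in window):
            return False
    # 3. Sequence sum bounds.
    if not in_range(sum(seq), seq_sum):
        return False
    # 4. Sequence mean bounds.
    return in_range(sum(seq) / len(seq), seq_mean)

values = [0, 2, 5, 2, 0]
ok = satisfies_bounds(values, 1, 3, per_frame=(1, 10), near_frame=(4, 10),
                      r=1, seq_sum=(5, 20), seq_mean=(2, 4))   # True
```

With r = 0 the same call returns False, because frame 1 (value 2) is then not near any frame in the tighter range.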


learn_params_social.m [deprecated]

Note that learn_params_social.m has been superseded by JAABA. It still exists in the Behavioral Microarray Toolbox, but its use is no longer encouraged.

Similar to the learn_params.m script, learn_params_social.m allows a user to label the starts and ends of episodes of a single social behavior -- a behavior involving a pair of flies. It then learns the threshold-based classifier that operates on a pair of fly trajectories and best replicates these labels. The following directions parallel the directions in learn_params.m.

  1. Choose whether to learn a behavior classifier from pre-labeled data (i.e. whether you have already labeled data or want to label new data).
  2. To label data where the flies are performing and not performing the behavior:

    Screenshot of labelbehaviorssocial.m function.

    1. Choose corresponding pairs of movies and MAT-files containing per-frame properties (see compute_perframe_stats.m).
    2. Choose the number of flies of each sex to label in each movie, and the number of frames to label for each labeled fly (the "sex" property should have previously been set using the classify_by_area.m script). Flies and intervals of frames to label are selected randomly from each movie.
    3. Choose the file to save the labels to. If you want to learn parameters again from this labeled data using different settings, you can load this file instead of relabeling.
    4. A GUI will then open, allowing you to set the start and ends of behavior sequences and the other fly involved.
    5. The frame of the video currently shown can be moved using the frame slider or by clicking on the fly's trajectory.
    6. The currently selected other fly is highlighted and a white line is drawn from the main fly to the other fly.
    7. To add a sequence in which the fly is performing the behavior, select the correct "other fly", go to the start of the sequence, and hit the "Add Start" button. This will highlight the segment starting at the current frame and going to the end of the trajectory (or until the start of the next segment). A colored line will connect the other fly's initial position to the labeled trajectory sequence.
    8. To set the end of this segment, go to the last frame of the behavior, and hit the "Add End" button.
    9. Labeled segments can be removed with the "Delete" button.
    10. You can zoom in and out using the figure toolbar. The small window in the top right shows what part of the video is shown in the larger main window.
    11. In the bottom panel below the main window, per-frame properties of the fly are shown. du_ctr is the forward velocity of the fly's center (relative to the fly's orientation, in pixels/frame), dv_ctr is the sideways velocity of the fly's center, and dtheta is the change in orientation (degrees/frame). cor is the center of rotation along the fly's major axis. The center of rotation is defined as the point on the fly that translates the least from one frame to the next, and cor is this point projected onto the major axis (it varies between -1, corresponding to the tail, and 1, corresponding to the nose). du_cor and dv_cor are the forward and sideways velocity of the fly's center of rotation.
    12. The color of the plotted point is related to the distance to the closest fly [for Matlab experts, the "Scatter Command" can be changed to any valid Matlab command to change the property encoding the trajectory's color].
    13. When done labeling a fly and its trajectory interval, close the window to go to the next fly/movie/finish.
  3. Choose a MAT-file to save the learned parameters to.
  4. Choose properties of the behavior classifier to learn. The behavior classifier is based on four types of bounds. The trajectory from time t1 to time t2 can be classified as the behavior if:
    1. the per-frame bounds hold: in each frame, given properties are within given loose ranges,
    2. the near-frame bounds hold: for each frame t, there is a frame within r frames of t in which given properties are within given tighter ranges,
    3. the sequence sum bounds hold: the total summed value of given properties over the entire sequence is within given bounds, and
    4. the sequence mean bounds hold: the mean value of given properties over the entire sequence is within given bounds.
    Within the "chooseproperties" dialog, you can select which properties are used for each of the four types of bounds. The exact parameter ranges will be learned automatically later, but you can also set sensible bounds on the values that these ranges can take. For instance, if the property cannot be negative, it is useful to bound the ranges at >=0.
  5. Now that all the parameters have been set, the software will automatically try to find the ranges of the bounds that result in the segmentation of the training data closest to the labels provided. Figure 1234 shows the current "Score" of the classifiers considered (higher is better). You can hit "Quit" at any time to use the best classifier found so far, or wait until the search converges. Properties of the current distribution of the classifier parameters and the types of classification errors made on the training data are printed out at fixed intervals.

    The "Score" is a simple function of the errors relative to the classifier, summed over the whole movie. Each type of error is assigned a weight, and the error for the movie is the weighted sum of the errors of each type. Each error itself has both a constant and a linear cost, with the linear portion weighted by the duration of the erroneous behavior.

  6. The learned parameters have been saved to the file selected in step 3. These can be input to a behavior detector.
  7. To illustrate the differences between the detected and labeled behaviors, the fly's position is plotted in and around frames where either the behavior is labeled or detected. The main fly is plotted in black; the other fly is plotted in green. Corresponding frames are connected in yellow. Labeled behaviors are shown in magenta and detected behaviors in cyan. At the start and end of each detected behavior, green and red circles are shown, respectively. There is a subplot for each labeled fly. Nothing is plotted in cases where no behaviors are labeled or detected.
  8. The parameters of the classifier are output.

    Screenshot of learn_params_social.m output.
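The error term behind the "Score" in step 5 can be sketched as a weighted sum in which each error costs a constant plus an amount proportional to its duration. The Python below is an illustration only (the toolbox itself is MATLAB). The error type names "missed" and "spurious", the numeric weights and costs, and the function names are hypothetical, and the text does not say how this error is mapped to the score, so that step is left out.

```python
def error_cost(duration, constant_cost, linear_cost):
    """Cost of one error: a constant part plus a part proportional to duration."""
    return constant_cost + linear_cost * duration

def total_error(errors, weights, constant_cost, linear_cost):
    """Weighted sum of error costs.

    errors: list of (error_type, duration_in_frames) pairs.
    weights: weight assigned to each error type.
    """
    return sum(weights[kind] * error_cost(duration, constant_cost, linear_cost)
               for kind, duration in errors)

# Hypothetical example with two error types.
errors = [("missed", 10), ("spurious", 20)]
weights = {"missed": 2.0, "spurious": 1.0}
cost = total_error(errors, weights, constant_cost=1.0, linear_cost=0.5)
# missed: 2.0 * (1.0 + 0.5 * 10) = 12.0; spurious: 1.0 * (1.0 + 0.5 * 20) = 11.0
```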


detect_behaviors.m

This script uses a defined behavior classifier to segment trajectories of each fly (or pair of flies, for social behaviors) into sequences in which the fly is and is not performing the given behavior.

  1. Prompt for the names of MAT-file(s) containing the trajectories of the flies. These MAT-files should contain all the per-frame properties necessary for the behavior detection (see compute_perframe_stats.m and compute_perframe_stats_social.m).
  2. Prompt for the MAT-file defining the behavior to be classified. This is the output of either learn_params.m or learn_params_social.m.
  3. If there are missing per-frame parameters in the read-in trajectories, the script will try to figure out which functions it must call to compute these.
  4. The script then segments each trajectory and prompts for the name of the MAT-file to save the results to. In this MAT-file, it saves the array of structs seg, where seg(fly) is the segmentation for fly fly. seg has the fields t1 and t2, which define the start and end frames of intervals in which the fly is performing the behavior.
  5. The segmented trajectories are also plotted in a manner similar to learn_params.m and learn_params_social.m. There is a subplot for each fly. The fly's position in each frame is plotted in black. Frames in which the behavior is detected are plotted in magenta. At the start and end of each detected behavior, green and red circles are shown, respectively. In the case of social behaviors, during and near detected behaviors, the other fly's position is plotted in green. Corresponding frames for the main and other fly are connected in yellow.

Screenshot of output of detect_behaviors.m.
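The seg structure described in step 4 stores each detected bout as a start frame and an end frame. As an illustration of that representation, the Python sketch below (the toolbox itself is MATLAB) converts a per-frame true/false sequence into t1 and t2 lists. The function name is hypothetical, and the 0-based indexing and inclusive end frames are assumptions; MATLAB frame indices are 1-based.

```python
def segment_intervals(flags):
    """Convert per-frame booleans into start (t1) and end (t2) frame lists."""
    t1, t2 = [], []
    for t, on in enumerate(flags):
        prev = flags[t - 1] if t > 0 else False
        if on and not prev:
            t1.append(t)          # behavior starts at this frame
        if prev and not on:
            t2.append(t - 1)      # behavior ended on the previous frame
    if flags and flags[-1]:
        t2.append(len(flags) - 1)  # close a bout that runs to the last frame
    return {"t1": t1, "t2": t2}

seg_fly = segment_intervals([False, True, True, False, True])
# seg_fly == {"t1": [1, 4], "t2": [2, 4]}
```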


histogramproperties.m


Screenshot of histogramproperties.m for one property over four different data types.

The histogramproperties GUI allows you to make histograms of a single per-frame property or of a pair of per-frame properties. When the GUI is started, you will be prompted for a MAT-file containing the trajectories with per-frame properties. You can then interact with the GUI to explore statistics of the per-frame properties for the flies in the trajectories. The "No. Properties" radio buttons allow you to specify whether you want to histogram a single per-frame property or jointly histogram a pair of per-frame properties. In the former case, the histogram will consist of the property plotted against frequency. In the latter case, a 2-D histogram will be computed and displayed as an image, with the color of the image reflecting the frequency. The "Property 1" and "Property 2" (if two properties are histogrammed) panels allow you to specify which properties to histogram. The pop-up menu displays the name of the per-frame property (the actual name of the field of the "trx" struct). You can also specify properties of the histogram in this panel: the number of bins to histogram into and the range of the data to histogram, either in percent or in absolute units. Finally, you can specify whether a transformation should be applied to the per-frame data before histogramming. "None (Identity)" specifies that no transformation should be applied, "Absolute value" specifies that the absolute value of the data should be taken, and "Log absolute value" specifies that the log of the absolute value should be taken. If the loaded trajectories do not have sufficient per-frame statistics to explore, clicking the "Compute Per-Frame Stats" button calls compute_perframe_stats.m on the current trajectories.

You can compare histograms of different portions of the data. The Data panel defines the different histograms to compute. You can specify a behavior segmentation file (computed using detect_behaviors.m) and look at the per-frame properties only during the specified behavior ("During behavior...") or only outside it ("Invert behavior"). You can look at the histogram of the properties during all frames of the behavior or condense each sequence of a behavior into a single statistic and then histogram these statistics ("Interval Averaging"). "None (Per-frame)" specifies that all frames should be used, i.e., there is no summarizing of each sequence, and instead the per-frame properties from all intervals are just concatenated together. "Interval mean" specifies that the mean per-frame property for each interval should be histogrammed. "Interval median" indicates that the median per-frame property for each interval should be histogrammed. "Interval start" indicates that the per-frame property in the first frame of the interval should be histogrammed. "Interval end" indicates that the per-frame property in the last frame of the interval should be histogrammed. If the type or sex fields of the trx struct have been set to reflect the type or sex of each fly (e.g., with the classify_by_area.m script), you can also separate the flies by type/sex ("Fly Type(s)"). You can set the name of this data type in the "Data Name" field.
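The "Interval Averaging" options can be sketched as follows. This is an illustrative Python sketch (the toolbox itself is MATLAB); the function name and the mode strings are hypothetical, and the interval end frames are assumed to be inclusive and 0-based.

```python
from statistics import mean, median

def summarize_intervals(values, t1, t2, mode="none"):
    """Collect the data to histogram from behavior intervals t1[i]..t2[i]."""
    out = []
    for start, end in zip(t1, t2):
        seq = values[start:end + 1]
        if mode == "none":        # "None (Per-frame)": concatenate all frames
            out.extend(seq)
        elif mode == "mean":      # "Interval mean"
            out.append(mean(seq))
        elif mode == "median":    # "Interval median"
            out.append(median(seq))
        elif mode == "start":     # "Interval start": first frame of the interval
            out.append(seq[0])
        elif mode == "end":       # "Interval end": last frame of the interval
            out.append(seq[-1])
        else:
            raise ValueError(f"unknown mode: {mode}")
    return out

speed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
per_interval_mean = summarize_intervals(speed, [0, 3], [1, 5], "mean")  # [1.5, 5.0]
```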

Properties of the plot can be set in the "Plot Parameters" panel. First, you can set precisely what histogram statistic is plotted ("Plot Statistic"). "Total Count" is the total frequency over all flies for each bin. "Total Fraction" is the total fraction (frequency normalized to sum to 1) over all flies. "Mean Fraction per Fly" computes the fraction histogram for each fly, then averages the histograms. "Mean Count per Fly" computes the frequency histogram for each fly, then averages the histograms. Next, you can set whether the standard deviation or standard error should be plotted ("Plot Error Bars"). Standard deviations/errors are computed over flies. If one selects "Plot Individual Flies", the histograms for each individual fly will also be plotted, in addition to the population summaries. If "Plot log of counts/fraction" is checked, then the log frequencies/fractions are plotted.
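The difference between the four "Plot Statistic" options is easiest to see on a small example. The Python sketch below is for illustration only (the toolbox itself is MATLAB). The function names and dictionary keys are hypothetical, the data are made up, and the sketch assumes every fly has at least one value inside the histogram range.

```python
def histogram_counts(values, edges):
    """Count values per bin; bins are [edges[i], edges[i+1]), last bin closed."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return counts

def plot_statistics(per_fly_values, edges):
    per_fly_counts = [histogram_counts(v, edges) for v in per_fly_values]
    nbins, nflies = len(edges) - 1, len(per_fly_counts)
    total_count = [sum(c[i] for c in per_fly_counts) for i in range(nbins)]
    n = sum(total_count)
    # Normalize each fly's histogram separately before averaging.
    per_fly_fraction = [[c / sum(counts) for c in counts] for counts in per_fly_counts]
    return {
        "total_count": total_count,
        "total_fraction": [c / n for c in total_count],
        "mean_count_per_fly": [c / nflies for c in total_count],
        "mean_fraction_per_fly": [sum(f[i] for f in per_fly_fraction) / nflies
                                  for i in range(nbins)],
    }

# Fly A has 4 frames and fly B has 2, so A dominates the pooled statistics.
stats = plot_statistics([[0.5, 0.5, 0.5, 1.5], [1.5, 1.5]], [0, 1, 2])
```

Here the total fraction is [0.5, 0.5], while the mean fraction per fly is [0.375, 0.625], because the per-fly average gives each fly equal weight regardless of how many frames it contributes.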


Screenshot of histogramproperties.m for two properties and one data type.

Clicking the "Update Plot" button will update the plotted histogram to reflect any changes to the parameters made. Clicking the "Export..." button allows you to export the computed histograms to a selected MAT-file. The following variables are exported:


plot_behaviormicroarray.m

This script allows you to explore differences in the behavioral statistics of different types of flies. When the GUI is started, you will first be prompted to input the name(s) of the trajectory files already augmented with the per-frame properties. You can then interact with the GUI to plot and compare various behavioral statistics for different populations of flies.

Screenshot of plotbehaviormicroarray.m parameter settings.