Department of Biology, 216-76
California Institute of Technology, Pasadena, CA 91125
Neuroethological experiments frequently require several different types of data to be acquired, synchronized with each other, and stored. For example, an experiment may generate physiological data stored on analog tape and/or digitized to computer files, protocols stored in notebooks, tape recordings of the experimenter's comments, photographs, etc. In addition, it is often desirable to display on a single monitor, or record on a single videotape, the synchronized outputs of two video cameras. For example, in animal tracking experiments, simultaneous views from two camera angles are necessary to determine positions in three dimensions.
During data analysis, synchronizing data stored in different media or on separate videotapes is often problematic, especially if the required temporal precision is substantially finer than one second, or if the periods of experimental interest are sparse in long video recording sessions. For such analysis, it would be convenient to record on a single videotape information about both the subject and the state of the experimental apparatus, and images from multiple cameras. In this paper, we describe methods for mixing computer and video images, and for mixing images from two video cameras into a single video signal. We present three applications of this technology to our neurophysiological and behavioral research (Rasnow et al., 1993; Hartmann and Bower, 1996).
We first describe how the Apple Macintosh 660AV and 6100AV families of multimedia computers support live video, and in particular, how video can be viewed, annotated, and analyzed within virtually any commercial software package. These methods work even with programs that are not explicitly designed to support video, such as MATLAB (The MathWorks Inc., Natick, MA) and Illustrator (Adobe Systems Inc., Mountain View, CA). We have focused on these particular computers because they support the methods described here with stock hardware and require no programming. It may be considerably more difficult to implement these methods on other computer models and platforms.
We next describe a novel method for mixing the outputs of two video cameras into a single video image, which can then be recorded on standard videotape for later review and analysis. This simple, inexpensive circuit works equally well with PAL and NTSC video, and it also provides an output for running a field counter useful for data synchronization and time coding of the videotape.
Finally, we describe our use of video in three specific experimental applications: 1. We used chroma keying to mix computer and video images to create high-resolution maps of the electric potentials generated by weakly electric fish; 2. We used the analog video mixer to synchronize neurophysiological recordings with ongoing behaviors of the freely-moving rat; and 3. We combined chroma keying and the analog video mixer to study the electric fish's behavioral strategies for electrolocation. Although we present these multimedia techniques in the context of our own research, they are likely to have many other applications in experimental neuroscience.
A common use of multimedia computers is digitizing video frames and storing them to disk, or "frame grabbing." However, frame grabbing places high demands on many of the computer's subsystems. Since each full-resolution video frame contains over 300,000 pixels, digitally storing uncompressed video requires on the order of 10 million bytes per second to be transferred to disk. Personal multimedia computers are thus often seriously limited in their frame capture rate, resolution, and storage capacity, and usually record video at lower quality than inexpensive VCRs. In the applications described here, instead of frame grabbing, we viewed video on the computer screen, and recorded the combined computer screen and video input to analog videotape (e.g., Fig. 2B). The same computer simultaneously controlled the experimental apparatus, which included demanding tasks such as unbuffered data acquisition at 288,000 16-bit samples per second (Rasnow, 1994).
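The storage-rate estimate above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the 640 x 480 frame size, the 30 frames per second rate, and the one byte per pixel (256 colors) are assumed values chosen to match "over 300,000 pixels," not figures taken from the hardware described here.

```python
def uncompressed_rate_bytes_per_s(width, height, frames_per_s, bytes_per_pixel):
    """Bytes per second needed to store every pixel of every frame."""
    return width * height * frames_per_s * bytes_per_pixel

# Assumed values (not from the text): 640 x 480 frame, 30 frames/s, 1 byte per pixel.
pixels_per_frame = 640 * 480
rate = uncompressed_rate_bytes_per_s(640, 480, 30, 1)

print(pixels_per_frame)  # 307200
print(rate / 1e6)        # 9.216 (million bytes per second)
```

Under these assumptions the rate is about 9 million bytes per second, and it doubles if each pixel is stored in 16 bits.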
It is also possible to add video capabilities to non-multimedia computers by installing plug-in cards. For example, we first experimented with multimedia techniques using a TV Producer card (Computer Friends Inc., Portland, OR) in an Apple Macintosh II computer. The requisite capabilities of a video card for our applications include video input and output, and chroma keying.
Chroma keying is just one example of "alpha mixing," where each output value is computed by a weighted average of the source values (Jack, 1993):
output pixel = a * computer pixel + (1 - a) * video pixel.
For chroma keying, the weights (a, 1-a) are either zero or one, depending on the color of the computer's pixel. Another alpha mixing rule called "video transparency" is achieved by setting the weights to constants between zero and one. This results in both computer and video source images being visible and semi-transparent, similar to "cross fading" transitions in cinema. Since both computer and video source images are attenuated by their respective weights, the contrast and saturation of each are proportionately reduced in the output image. This inherent image degradation with video transparency has made chroma keying the more frequent choice in our research.
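The two mixing rules can be sketched in software as follows. This is a minimal illustration of the weighted-average formula only, not the way the display hardware implements it; the RGB tuples, the function names, and the key color value are all hypothetical.

```python
def alpha_mix(computer_pixel, video_pixel, a):
    """Weighted average of two RGB pixels: a * computer + (1 - a) * video."""
    return tuple(a * c + (1 - a) * v for c, v in zip(computer_pixel, video_pixel))

def chroma_key(computer_pixel, video_pixel, key_color):
    """Chroma keying: a = 0 where the computer pixel is the key color, else a = 1."""
    a = 0 if computer_pixel == key_color else 1
    return alpha_mix(computer_pixel, video_pixel, a)

KEY = (0, 255, 0)      # hypothetical key color
video = (40, 80, 120)  # hypothetical video pixel

print(chroma_key(KEY, video, KEY))                 # (40, 80, 120): video shows through
print(chroma_key((255, 255, 255), video, KEY))     # (255, 255, 255): computer pixel is opaque
print(alpha_mix((200, 200, 200), (0, 0, 0), 0.5))  # (100.0, 100.0, 100.0): video transparency
```

The last line shows the attenuation described above: with a = 0.5, a computer pixel of brightness 200 over black video is displayed at only 100.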
Alpha mixing works with most computer programs, including those not explicitly designed to support video, because the computer and video images are combined low in the computer's display hierarchy, at the level of the screen buffer. Most computer programs maintain their graphical and display environments at more abstract or higher levels. If the screen buffer is configured for transparency or chroma keying, then parts of a program's display area may contain live video, without the program being "aware" of it. In other words, the program's function and operation are unaffected, regardless of whether video is visible within its windows.
Transparent video is available with a public domain System Extension from Apple Computer called "AV Digitizer Options." Installing this file adds an "AV Options" pop-up menu in video monitoring software (such as Apple's "Video Monitor"). Selecting this menu displays a window with controls to set the weights (a, 1 - a) from 8 values between zero and one. Every pixel of the display is alpha mixed with the video. Therefore, video can be visible within any application window located over the video monitor window.
Our method for chroma keying takes advantage of a peculiarity of these computers' screen buffer implementation. With the computer displaying 256 colors, one of the colors is always "key" or 100% transparent to video. We find this color by displaying the palette of all 256 colors over the video window, and looking for a missing color filled with the background video and surrounded by the other colors. We then fill the desired parts of the application window with this key color to view the underlying video. For example, from NIH Image 1.29 software (http://rsb.info.nih.gov/nih-image), we select the key color using the Color Picker tool with the "CLUT" (color lookup table) window positioned over the video, and fill the main window using Image's Paint Bucket tool. Chroma keying with MATLAB 4.2 is described in Appendix 1.
At the heart of the design is a MAX453 video amplifier and two-channel multiplexer. This chip, by itself, is sufficient to mix two genlocked video signals; the remainder of the circuit is devoted to telling the MAX453 precisely when to switch between the two camera inputs. The composite video output of camera A feeds the external sync input of camera B and also the LM1881 video sync separator, which extracts vertical and horizontal sync pulses. The monostable multivibrators (74HC123) are triggered by these pulses and select the portions of each horizontal line from camera A that will be replaced with camera B in the mixed image. The pulse times of the multivibrators are set by adjusting the potentiometers, which determine the size and position of the camera B window in the mixed image. Each potentiometer adjusts one side of the window (left, right, top, bottom). The timing information from the horizontal and vertical multivibrators controls when the MAX453 multiplexer's output switches between camera A and camera B. The MAX453 output can drive a 75 ohm load directly, and thus can be sent directly to a standard VCR and/or TV monitor. Vertical sync pulses from the LM1881 (pin 5) can be used to trigger a field counter LED display.
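The switching rule that the multivibrators implement in hardware can be summarized in software. The sketch below is a logical model only, under the assumption that the four potentiometers reduce to four window edges: two scan-line numbers and two times within a line. The function name and all numerical settings are hypothetical.

```python
def select_camera(line, t_us, top, bottom, left_us, right_us):
    """Return 'B' inside the inset window and 'A' everywhere else.

    line             : scan line within the field, counted from vertical sync
    t_us             : time since horizontal sync on this line, in microseconds
    top, bottom      : first line of the window and first line below it
    left_us, right_us: window edges within each line, in microseconds
    """
    in_rows = top <= line < bottom
    in_cols = left_us <= t_us < right_us
    return "B" if (in_rows and in_cols) else "A"

# Hypothetical settings: window on lines 150-229, from 35 to 55 us into each line.
settings = dict(top=150, bottom=230, left_us=35.0, right_us=55.0)

print(select_camera(100, 40.0, **settings))  # A (above the window)
print(select_camera(200, 40.0, **settings))  # B (inside the window)
print(select_camera(200, 20.0, **settings))  # A (left of the window)
```

Changing one of the four settings moves one edge of the window, which corresponds to turning one potentiometer.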
Figure 1. Schematic diagram of the video multiplexer. The circuit in the top left is a voltage regulator providing ±5 volts for the mixer circuit below.
We achieved these goals by imaging the immobilized fish and electrode array with a stationary video camera, and displaying the live video on the computer monitor using chroma keying and NIH Image software (Fig. 2). We first traced the body profile with dots in a non-key color using the Measure Coordinates tool. If the fish moved during the experiment, these dots would become misaligned with the video image and be immediately apparent. We could then return the fish to its former position by realigning the video and body outline images. The body coordinates were also saved to a file and used to draw the body in subsequent figures and data analysis.
Figure 2. A. Schematic of the apparatus for mapping the electric organ discharge of the weakly electric fish. The computer display is chroma keyed with video of the fish using a TV Producer card, and recorded on videotape. B. Photograph from videotape playback of the experiment. The NIH Image window labeled "map3.pict" was filled with the key color, revealing live video of the immobilized fish (Gymnotus sp.) and electrode array (far left). Prior electrode recording sites were drawn as dots in another color, which "float" over the video. The fish's body was also outlined with dots in a different color. MATLAB 3.5, running in the foreground, displays waveforms sampled from the electrode array, and commands and diagnostics in its command window (lower left).
The fish's electric potential was measured with a flexible array of five electrodes pressed against the body. We measured each electrode position by clicking with Image's Measure Coordinates tool over the electrode tips, seen in the chroma keyed video. Note that the 500 µm diameter electrode tips were generally too small to locate in single video frames. However, the lower noise and greater apparent resolution and contrast of live video allowed us to locate them. After measuring each tip position (which left an opaque dot floating over the video image), we switched applications from Image to MATLAB, and executed a script to digitize and record the amplified electrode signals with custom hardware and software (Rasnow, 1994). We then monitored on the computer screen the movement of the electrode array to its next location.
As the experiment progressed, the fish's body became covered with dots indicating where waveforms had already been recorded, thereby permitting efficient and non-redundant sampling at additional points covering the fish's body (Fig. 2B). Upon completion of data acquisition, the waveforms recorded in MATLAB had to be combined with their respective locations, recorded in Image, to generate an electric field map. If there was doubt about the correspondence between these files (e.g., if there was an extra data point in the locations or waveforms files), it was a simple matter to review the experiment videotape, which contained a complete and synchronized record of all computer and video activity (along with two audio channels containing the experimenter's comments and the fish's electric organ discharge frequency). After additional digital signal processing in MATLAB, the waveforms were interpolated in space and displayed as pseudocolor movies of the fish's electric organ discharge. Interested readers can see the resulting QuickTime movies on our web pages (http://www.bbb.caltech.edu/ElectricFish/).
As shown in Figure 3, we used the mixer to combine images from two video cameras: a consumer camcorder was used to monitor the rat's behavioral activity, and a black-and-white surveillance camera in another room was positioned in front of an oscilloscope displaying the neural data. Because the video mixer permits variable window sizes and positions, we placed the oscilloscope image in a region of the picture where the animal did not explore. Two LED counters were mounted above the oscilloscope display: one counted fields from the video mixer (thereby providing an accurate time-code identifying each video frame), and the second counted pulses sent from the data acquisition system indicating the number of the open data file.
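A field count read from the LED display can be converted to elapsed time and to a frame number as sketched below. The field rates are the nominal values of the two broadcast standards (60000/1001 fields per second for NTSC, 50 for PAL); the function names are hypothetical, and the sketch assumes the counter was reset to zero at the start of the recording.

```python
from fractions import Fraction

# Nominal field rates of the two standards, in fields per second.
FIELD_RATE_HZ = {"NTSC": Fraction(60000, 1001), "PAL": Fraction(50)}

def field_to_seconds(field_count, standard="NTSC"):
    """Elapsed time in seconds for a given field count (counter assumed to start at 0)."""
    return float(field_count / FIELD_RATE_HZ[standard])

def field_to_frame(field_count):
    """Two interlaced fields make one frame."""
    return field_count // 2

print(field_to_seconds(3000, "PAL"))               # 60.0
print(round(field_to_seconds(60000, "NTSC"), 3))   # 1001.0
print(field_to_frame(3001))                        # 1500
```

Because each field lasts roughly 17 to 20 ms, the counter resolves events about twice as finely as a frame number alone.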
Figure 3. A. Schematic of the apparatus for combining and synchronizing rat behavior and physiological data. Physiological data were also recorded on the audio channels (dashed line). B. Video frame showing the perioral structures of the rat with an oscilloscope trace of neural activity and video field counter.
During the experiment, the video mixer allowed us to monitor the behaviors of the animal and the physiological data simultaneously, and in real time. After the experiment, we scanned through many hours of videotape examining correlated behavioral and neural activity. If a period of neural activity appeared particularly interesting, we could easily choose the corresponding digitized physiological data file to examine the responses in greater detail. Because the behavioral and physiological data were visually superposed, we were able to correlate high activity levels with behaviors involving the rat's upper lip. By examining this relationship carefully, frame by frame, we discerned that the granule cell activity was not correlated directly with the movement of perioral structures, but rather with tactile input to these regions (Hartmann and Bower, 1996).
We later analyzed the videotape on a Macintosh 6100AV multimedia computer, looking for interesting sequences of body positions and orientations to simulate the electric field (Assad et al., 1993). The video was visible in a MATLAB Figure window using chroma keying and Apple Video Monitor software (see Appendix 1). We first overlaid coordinate system axes on the video using MATLAB's graphics functions. On the overhead view, we placed a 3-dimensional coordinate axis using a flat (orthographic) projection along the z-axis. The side view suffered from perspective distortion (i.e., the fish's image became larger when it was nearer to the camera), which we compensated for with a perspective projection (MATLAB Reference Guide; Foley et al., 1990). The viewpoint and additional parameters defining this coordinate system were initially estimated from the camera location and lens focal length, and adjusted empirically by overlaying the coordinate axes with the tank edges seen in the video.
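The difference between the two projections can be illustrated with a small sketch. It assumes an idealized pinhole camera for the side view, looking along the y-axis from a known distance; the function names, the camera geometry, and the numerical values are hypothetical and are not the parameters used in our analysis.

```python
def orthographic_xy(point):
    """Overhead view: project along the z-axis by dropping z."""
    x, y, z = point
    return (x, y)

def perspective_xz(point, camera_distance, focal_length):
    """Side view from a pinhole camera at y = -camera_distance, looking along +y.

    Image coordinates scale by focal_length / depth, so nearer objects look larger.
    """
    x, y, z = point
    depth = y + camera_distance
    if depth <= 0:
        raise ValueError("point is at or behind the camera")
    scale = focal_length / depth
    return (x * scale, z * scale)

print(orthographic_xy((10.0, 5.0, 2.0)))                # (10.0, 5.0)
print(perspective_xz((10.0, 0.0, 2.0), 100.0, 50.0))    # (5.0, 1.0)
print(perspective_xz((10.0, -50.0, 2.0), 100.0, 50.0))  # (10.0, 2.0): nearer, so larger
```

Adjusting camera_distance and focal_length until projected tank edges coincide with those in the video corresponds to the empirical adjustment described above.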
We rendered a 3-dimensional 256-node surface model of the fish with non-key colors in these two coordinate systems overlaying the video. With graphical controls, we adjusted parameters to translate, rotate, and bend the model until it was in register with the video images of the fish (Fig. 4B). Aligning the video image and model directly proved more accurate and quicker than trying to orient the model from fiducial points on the fish's body. In fact, we could locate very few points consistently on the body, because of the low video image quality and the lack of visually distinct features on the dark fish.
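The translation and rotation steps amount to a rigid-body transform of the model nodes. The sketch below shows one such transform, a rotation about the vertical (z) axis followed by a translation; it is an illustration under assumed conventions, with a hypothetical function name and a two-node stand-in for the model, and it omits the bending step.

```python
import math

def transform_nodes(nodes, yaw_deg, offset):
    """Rotate each (x, y, z) node about the z-axis by yaw_deg, then translate by offset."""
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    dx, dy, dz = offset
    return [(c * x - s * y + dx, s * x + c * y + dy, z + dz) for x, y, z in nodes]

# Hypothetical two-node "model" lying along the x-axis.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
moved = transform_nodes(model, 90.0, (5.0, 0.0, 2.0))

print([tuple(round(v, 6) for v in p) for p in moved])
# [(5.0, 0.0, 2.0), (5.0, 1.0, 2.0)]
```

Each graphical control then only needs to change one parameter (an angle or an offset component) and redraw the transformed nodes in both views.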
Figure 4. A. Schematic of the apparatus for reconstructing trajectories and body orientations. Infrared-sensitive video cameras imaged the fish from two orthogonal angles. Their analog signals were mixed and recorded on a consumer VCR. B. Later, the videotape was viewed in a MATLAB figure window using chroma keying. A 3-dimensional wireframe model of the fish's body and the object (black) was superimposed on the video and translated, rotated, and bent (by the controls around the window sides) to coincide with the fish's body in the two video views. Note that the video contrast and signal-to-noise ratio were inherently poor because the water absorbs infrared wavelengths, which were used for illumination without introducing visual cues.
% make a palette of all 256 colors
c=hsv(256); colormap(c);
% draw all colors in a grid, filling the window
set(gca,'units', 'normalized', 'position', [0 0 1 1]);
image(reshape(1:256, 16, 16));
% identify the key color
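disp('Place the Figure window over the Video Monitor window and click on the transparent color');
% ginput returns [x y]: x is the grid column and y is the grid row
click = round(ginput(1));
% reshape fills the 16-by-16 grid column by column, so index = (column-1)*16 + row
colorIndex = (click(1) - 1) * 16 + click(2);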
% clear the figure and set the axes background to the key color
clf; set(gca, 'Color', c(colorIndex,:));
Figure 5. Printed circuit board layout for the video mixer.