Using the Kinect for Windows V1 from Image Acquisition Toolbox

This example shows how to obtain the data available from a Kinect® for Windows® V1 sensor using Image Acquisition Toolbox™.

Utility Functions

To keep this example as simple as possible, some utility functions for working with the Kinect for Windows metadata have been created. These utility functions include the util_skeletonViewer function, which accepts the skeleton data, color image, and number of skeletons as inputs and displays the skeleton overlaid on the color image.

See What Kinect for Windows Devices and Formats are Available

The Kinect for Windows has two sensors, a color sensor and a depth sensor. To enable independent acquisition from each of these sensors, they are treated as two independent devices in the Image Acquisition Toolbox. This means that a separate VIDEOINPUT object needs to be created for each of the color and depth (IR) devices.

% The Kinect for Windows Sensor shows up as two separate devices in IMAQHWINFO.
hwInfo = imaqhwinfo('kinect')
hwInfo =

       AdaptorDllName: [1x68 char]
    AdaptorDllVersion: '4.5 (R2013a Prerelease)'
          AdaptorName: 'kinect'
            DeviceIDs: {[1]  [2]}
           DeviceInfo: [1x2 struct]

hwInfo.DeviceInfo(1)
ans =

             DefaultFormat: 'RGB_640x480'
       DeviceFileSupported: 0
                DeviceName: 'Kinect Color Sensor'
                  DeviceID: 1
     VideoInputConstructor: 'videoinput('kinect', 1)'
    VideoDeviceConstructor: 'imaq.VideoDevice('kinect', 1)'
          SupportedFormats: {1x7 cell}

hwInfo.DeviceInfo(2)
ans =

             DefaultFormat: 'Depth_640x480'
       DeviceFileSupported: 0
                DeviceName: 'Kinect Depth Sensor'
                  DeviceID: 2
     VideoInputConstructor: 'videoinput('kinect', 2)'
    VideoDeviceConstructor: 'imaq.VideoDevice('kinect', 2)'
          SupportedFormats: {'Depth_320x240'  'Depth_640x480'  'Depth_80x60'}
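The device IDs above can also be looked up by device name instead of being hard-coded. The following is a minimal sketch of that lookup. It assumes the DeviceInfo values shown in the output above and uses a hand-built stand-in struct (deviceInfo, a name chosen here for illustration) so that it runs without a sensor attached.

```matlab
% Stand-in for hwInfo.DeviceInfo, using the values displayed above.
% With a sensor attached, use instead:
%   hwInfo = imaqhwinfo('kinect');
%   deviceInfo = hwInfo.DeviceInfo;
deviceInfo = struct( ...
    'DeviceName',    {'Kinect Color Sensor', 'Kinect Depth Sensor'}, ...
    'DeviceID',      {1, 2}, ...
    'DefaultFormat', {'RGB_640x480', 'Depth_640x480'});

% Look up each device ID by its name
names   = {deviceInfo.DeviceName};
colorID = deviceInfo(strcmp(names, 'Kinect Color Sensor')).DeviceID;
depthID = deviceInfo(strcmp(names, 'Kinect Depth Sensor')).DeviceID;

fprintf('Color device ID: %d, depth device ID: %d\n', colorID, depthID);
```

The resulting colorID and depthID can then be passed to videoinput('kinect', ...) in place of the literal 1 and 2.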

Acquire Color and Depth Data

To acquire synchronized color and depth data, use manual triggering instead of immediate triggering. With the default immediate triggering, the two streams are started one after the other, and the startup overhead of each stream introduces a lag between them.

% Create the VIDEOINPUT objects for the two streams
colorVid = videoinput('kinect',1)
Summary of Video Input Object Using 'Kinect Color Sensor'.

   Acquisition Source(s):  Color Source is available.

  Acquisition Parameters:  'Color Source' is the current selected source.
                           10 frames per trigger using the selected source.
                           'RGB_640x480' video data to be logged upon START.
                           Grabbing first of every 1 frame(s).
                           Log data to 'memory' on trigger.

      Trigger Parameters:  1 'immediate' trigger(s) on START.

                  Status:  Waiting for START.
                           0 frames acquired since starting.
                           0 frames available for GETDATA.

depthVid = videoinput('kinect',2)
Summary of Video Input Object Using 'Kinect Depth Sensor'.

   Acquisition Source(s):  Depth Source is available.

  Acquisition Parameters:  'Depth Source' is the current selected source.
                           10 frames per trigger using the selected source.
                           'Depth_640x480' video data to be logged upon START.
                           Grabbing first of every 1 frame(s).
                           Log data to 'memory' on trigger.

      Trigger Parameters:  1 'immediate' trigger(s) on START.

                  Status:  Waiting for START.
                           0 frames acquired since starting.
                           0 frames available for GETDATA.

% Set the triggering mode to 'manual'
triggerconfig([colorVid depthVid],'manual');

Set the FramesPerTrigger property of the VIDEOINPUT objects to 100 to acquire 100 frames per trigger. In this example, 100 frames are acquired to give the Kinect for Windows sensor sufficient time to start tracking a skeleton.

colorVid.FramesPerTrigger = 100;
depthVid.FramesPerTrigger = 100;
% Start the color and depth device. This begins acquisition, but does not
% start logging of acquired data.
start([colorVid depthVid]);
% Trigger the devices to start logging of data.
trigger([colorVid depthVid]);
% Retrieve the acquired data
[colorFrameData,colorTimeData,colorMetaData] = getdata(colorVid);
[depthFrameData,depthTimeData,depthMetaData] = getdata(depthVid);
% Stop the devices
stop([colorVid depthVid]);
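One way to check how closely the two streams are aligned is to compare the AbsTime field of the color and depth metadata frame by frame. The sketch below assumes AbsTime holds a MATLAB clock vector. The metadata values are synthetic stand-ins invented for illustration, not measurements from a sensor.

```matlab
% Synthetic stand-ins for colorMetaData and depthMetaData (3 frames each).
% With a sensor attached, use the metadata returned by getdata instead.
demoColorMeta = struct('AbsTime', ...
    {[2013 1 1 12 0 0.000], [2013 1 1 12 0 0.033], [2013 1 1 12 0 0.067]});
demoDepthMeta = struct('AbsTime', ...
    {[2013 1 1 12 0 0.002], [2013 1 1 12 0 0.034], [2013 1 1 12 0 0.070]});

% Per-frame offset (depth minus color) in seconds
nFrames = min(numel(demoColorMeta), numel(demoDepthMeta));
lagSeconds = zeros(nFrames, 1);
for k = 1:nFrames
    lagSeconds(k) = etime(demoDepthMeta(k).AbsTime, demoColorMeta(k).AbsTime);
end

fprintf('Largest color/depth offset: %.1f ms\n', 1000 * max(abs(lagSeconds)));
```

A large or steadily growing offset would suggest that the streams are not being triggered together.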

Configure Skeletal Tracking

The Kinect for Windows sensor provides different modes to track skeletons. These modes can be accessed and configured from the VIDEOSOURCE object of the depth device. Let's see how to enable skeleton tracking.

% Get the VIDEOSOURCE object from the depth device's VIDEOINPUT object.
depthSrc = getselectedsource(depthVid)
   Display Summary for Video Source Object:

      General Settings:
        Parent = [1x1 videoinput]
        Selected = on
        SourceName = Depth Source
        Tag =
        Type = videosource

      Device Specific Properties:
        Accelerometer = [-0.008547 -0.98046 -0.11966]
        BodyPosture = Standing
        CameraElevationAngle = 9
        DepthMode = Default
        FrameRate = 30
        IREmitter = on
        SkeletonsToTrack = [1x0 double]
        TrackingMode = Off

The properties on the depth source object that control the skeletal tracking features are TrackingMode, SkeletonsToTrack, and BodyPosture.

TrackingMode controls whether skeletal tracking is enabled and, if so, whether all joints are tracked ('Skeleton') or only the hip position is tracked ('Position'). Setting TrackingMode to 'Off' (the default) disables all tracking and reduces the CPU load.

The BodyPosture property determines how many joints are tracked. If BodyPosture is set to 'Standing', twenty joints are tracked. If it is set to 'Seated', ten joints are tracked.

The SkeletonsToTrack property can be used to selectively track one or two skeletons using the 'SkeletonTrackingID'. The currently valid values for 'SkeletonTrackingID' are returned as a part of the metadata of the depth device.

% Turn on skeletal tracking.
depthSrc.TrackingMode = 'Skeleton';
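To track specific people, the IDs for the skeleton-selection property could be taken from the depth metadata of an earlier acquisition. The sketch below is illustrative only. The frameMeta values are hypothetical, and it assumes that each metadata field holds one entry per skeleton slot (six slots) and that unoccupied slots are flagged as untracked.

```matlab
% Hypothetical metadata for one depth frame: six skeleton slots, two occupied.
frameMeta.IsPositionTracked  = [0 0 1 0 0 0];
frameMeta.IsSkeletonTracked  = [1 0 0 0 0 0];
frameMeta.SkeletonTrackingID = [17 0 23 0 0 0];

% A slot is occupied if either its position or its full skeleton is tracked
occupied = logical(frameMeta.IsPositionTracked) | ...
           logical(frameMeta.IsSkeletonTracked);

% Keep the IDs of the occupied slots, at most two of them
candidateIDs = frameMeta.SkeletonTrackingID(occupied);
idsToTrack   = candidateIDs(1:min(2, numel(candidateIDs)));
disp(idsToTrack);

% With a sensor attached, the selection would then be applied as:
%   depthSrc.SkeletonsToTrack = idsToTrack;
```

Tracking IDs can change when a person leaves and re-enters the field of view, so IDs read from old metadata may no longer be valid.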

Access Skeletal Data

The skeleton data that the Kinect for Windows produces is accessible from the depth device as a part of the metadata returned by GETDATA. The Kinect for Windows can track the position of up to six people in view and can actively track the joint locations of two of the six skeletons. It also supports two modes of tracking people based on whether they are standing or seated. In standing mode, the full 20 joint locations are tracked and returned; in seated mode, the 10 upper-body joints are returned. For more details on skeletal data, see the MATLAB documentation on the Kinect for Windows adaptor.

% Acquire 100 frames with tracking turned on.
% Remember to have a person in front of the
% Kinect for Windows to see valid tracking data.
colorVid.FramesPerTrigger = 100;
depthVid.FramesPerTrigger = 100;
start([colorVid depthVid]);
trigger([colorVid depthVid]);
% Retrieve the frames and check if any Skeletons are tracked
[frameDataColor] = getdata(colorVid);
[frameDataDepth, timeDataDepth, metaDataDepth] = getdata(depthVid);
% View skeletal data from depth metadata
metaDataDepth
metaDataDepth =

100x1 struct array with fields:

    AbsTime
    FrameNumber
    IsPositionTracked
    IsSkeletonTracked
    JointDepthIndices
    JointImageIndices
    JointTrackingState
    JointWorldCoordinates
    PositionDepthIndices
    PositionImageIndices
    PositionWorldCoordinates
    RelativeFrame
    SegmentationData
    SkeletonTrackingID
    TriggerIndex

We arbitrarily choose the 95th frame, near the end of the acquisition, to visualize the image and skeleton data.

% Check for tracked skeletons from depth metadata
anyPositionsTracked = any(metaDataDepth(95).IsPositionTracked ~= 0)
anySkeletonsTracked = any(metaDataDepth(95).IsSkeletonTracked ~= 0)
anyPositionsTracked =

     1


anySkeletonsTracked =

     1

The results above show that at least one skeleton is being tracked. If tracking is enabled but no IDs are specified with the SkeletonsToTrack property, the Kinect for Windows software automatically chooses up to two skeletons to track. Use the IsSkeletonTracked metadata to determine which skeletons are being tracked.

% See which skeletons were tracked.
trackedSkeletons = find(metaDataDepth(95).IsSkeletonTracked)
trackedSkeletons =

     1
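Rather than fixing the frame number in advance, the metadata can be scanned for frames in which a skeleton was actually tracked. The following is a minimal sketch of that scan. It runs on a small synthetic metadata array (demoMeta, a name chosen here for illustration) and assumes IsSkeletonTracked holds one flag per skeleton slot.

```matlab
% Synthetic stand-in for metaDataDepth: 5 frames, skeleton slot 2
% becomes tracked from frame 3 onward.
demoMeta = struct('IsSkeletonTracked', ...
    {[0 0 0 0 0 0], [0 0 0 0 0 0], [0 1 0 0 0 0], ...
     [0 1 0 0 0 0], [0 1 0 0 0 0]});

% Flag every frame that has at least one tracked skeleton
hasSkeleton   = arrayfun(@(m) any(m.IsSkeletonTracked), demoMeta);
trackedFrames = find(hasSkeleton);

if isempty(trackedFrames)
    warning('No skeleton was tracked in any frame.');
else
    % Use the last tracked frame and list its tracked skeleton slots
    frameToShow  = trackedFrames(end);
    trackedSlots = find(demoMeta(frameToShow).IsSkeletonTracked);
    fprintf('Frame %d has %d tracked skeleton(s).\n', ...
        frameToShow, numel(trackedSlots));
end
```

With real data, replace demoMeta with the metadata returned by getdata for the depth device.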

Display the skeleton's joint coordinates. Note that if the 'BodyPosture' property is set to 'Seated', the joint coordinate and joint index arrays will still have a length of 20, but only indices 2-11 (upper-body joints) will be populated.

jointCoordinates = metaDataDepth(95).JointWorldCoordinates(:, :, trackedSkeletons)
% Skeleton's joint indices with respect to the color image
jointIndices = metaDataDepth(95).JointImageIndices(:, :, trackedSkeletons)
jointCoordinates =

   -0.0119   -0.0072    1.9716
   -0.0107    0.0545    2.0376
   -0.0051    0.4413    2.0680
    0.0033    0.6430    2.0740
   -0.1886    0.3048    2.0469
   -0.3130    0.0472    2.0188
   -0.3816   -0.1768    1.9277
   -0.3855   -0.2448    1.8972
    0.1724    0.3022    2.0449
    0.3102    0.0382    2.0304
    0.3740   -0.1929    1.9591
    0.3786   -0.2625    1.9356
   -0.0942   -0.0850    1.9540
   -0.1367   -0.4957    1.9361
   -0.1356   -0.8765    1.9339
   -0.1359   -0.9284    1.8341
    0.0683   -0.0871    1.9504
    0.0706   -0.4822    1.9293
    0.0858   -0.8804    1.9264
    0.0885   -0.9321    1.8266


jointIndices =

   318   256
   317   240
   318   143
   319    92
   271   177
   239   243
   219   303
   216   323
   363   177
   399   243
   421   303
   424   322
   296   277
   286   387
   288   492
   286   520
   340   277
   342   384
   347   493
   350   522
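As an illustration of working with the world coordinates, the sketch below estimates the vertical distance from the head to the feet using the values printed above. It assumes the coordinates are [x y z] in meters relative to the sensor and that the joints are ordered with the head at row 4 and the left and right feet at rows 16 and 20. That ordering is an assumption to check against the adaptor documentation.

```matlab
% Joint world coordinates copied from the output above (20 joints, [x y z])
demoJoints = [ ...
   -0.0119   -0.0072    1.9716
   -0.0107    0.0545    2.0376
   -0.0051    0.4413    2.0680
    0.0033    0.6430    2.0740
   -0.1886    0.3048    2.0469
   -0.3130    0.0472    2.0188
   -0.3816   -0.1768    1.9277
   -0.3855   -0.2448    1.8972
    0.1724    0.3022    2.0449
    0.3102    0.0382    2.0304
    0.3740   -0.1929    1.9591
    0.3786   -0.2625    1.9356
   -0.0942   -0.0850    1.9540
   -0.1367   -0.4957    1.9361
   -0.1356   -0.8765    1.9339
   -0.1359   -0.9284    1.8341
    0.0683   -0.0871    1.9504
    0.0706   -0.4822    1.9293
    0.0858   -0.8804    1.9264
    0.0885   -0.9321    1.8266];

% Assumed joint ordering (see the lead-in above)
HEAD = 4; FOOT_LEFT = 16; FOOT_RIGHT = 20;

% Vertical (y) distance from the head joint to the mean foot height
footHeight = mean(demoJoints([FOOT_LEFT FOOT_RIGHT], 2));
headToFeet = demoJoints(HEAD, 2) - footHeight;

% Mean distance of the skeleton from the sensor along z
meanDepth = mean(demoJoints(:, 3));

fprintf('Head-to-feet distance: %.3f, mean depth: %.3f\n', ...
    headToFeet, meanDepth);
```

The head joint sits below the top of the head, so this figure underestimates a person's actual height.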

Draw the Skeleton Over the Corresponding Color Image

% Pull out the 95th color frame. The variable is named colorImage to
% avoid shadowing the built-in IMAGE function.
colorImage = frameDataColor(:, :, :, 95);

% Find number of Skeletons tracked
nSkeleton = length(trackedSkeletons);

% Plot the skeleton
util_skeletonViewer(jointIndices, colorImage, nSkeleton);
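The source of the utility function is not shown here. As a rough idea of what such an overlay involves, the sketch below draws line segments between connected joints on top of an image. It is not the utility's actual implementation: the bone list and joint ordering are assumptions, the blank demoImage stands in for a real color frame, and demoIndices is copied from the joint indices printed earlier.

```matlab
% Joint image indices copied from the output above (20 joints, [x y])
demoIndices = [318 256; 317 240; 318 143; 319  92; 271 177; ...
               239 243; 219 303; 216 323; 363 177; 399 243; ...
               421 303; 424 322; 296 277; 286 387; 288 492; ...
               286 520; 340 277; 342 384; 347 493; 350 522];

% Blank 640x480 RGB frame standing in for the color image
demoImage = zeros(480, 640, 3, 'uint8');

% Assumed joint connectivity: spine and head, left arm, right arm,
% left leg, right leg (each row is one bone, as a pair of joint rows)
bones = [ 1  2;  2  3;  3  4; ...
          3  5;  5  6;  6  7;  7  8; ...
          3  9;  9 10; 10 11; 11 12; ...
          1 13; 13 14; 14 15; 15 16; ...
          1 17; 17 18; 18 19; 19 20];

% Show the frame, then draw each bone as a line with joint markers
figure;
image(demoImage);
axis image;
hold on;
for k = 1:size(bones, 1)
    xy = demoIndices(bones(k, :), :);
    plot(xy(:, 1), xy(:, 2), 'g-o', 'LineWidth', 1.5);
end
hold off;
```

In this sample the foot joints have y values above 480, so they fall below the bottom edge of the 640x480 frame and the axes expand to include them.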