I really do like the Microsoft Kinect, though at the moment not so much as a games device; rather as a tool for the many, many ‘hackers’ out there who want to see what it can actually do and take it to the extreme. And, as discussed both on this blog and on the podcast, the hacks so far are quite good and have huge potential.
However, I am here today to talk about something which isn’t spoken about much: the video conferencing application of this technology. More specifically, how it may be able to increase the use of video conferencing in regions with rather poor Internet speeds.
For what you get, it is relatively inexpensive; other, ‘professional’ units cost in the region of £6000 (though these are a lot better). Next to that, the £130 price seems tiny, and we have discussed on the podcast how the parts really cost in the region of £30 to £50 to manufacture. So this is the first hurdle out of the way for providing communities in poorer regions of the world with this technology.
The whole idea of the infra-red camera and projector is that they work together to map the room out: the projector casts a pattern of infra-red dots, and the camera reads how that pattern falls on the surfaces in front of it to estimate the distance to each point, building up a three-dimensional map of the room. That allows you to work out what is in the background and what is in the foreground.
You may be wondering where I am going with this, but if you can work out what is in the background, you can work out what is not relevant to the image needed within the video conference (this is working on the assumption that background information is less relevant). So if you take this depth information and lace it together with the image, you can start to process the image’s quality based on priority.
With this processing you could then help to reduce the bandwidth needed to send the image over the network. For example, the background isn’t really that important in most circumstances, so why bother sending it to the end-client in high quality? You could just send it at a very low resolution (just enough to ensure that the end-client has a very basic idea of what’s happening). Then, for the person closest to the camera, you could send that part of the image at an appropriate resolution.
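To make that a little more concrete, here is a minimal Python sketch of the foreground/background split. It assumes the depth data has already been read off the device as a grid of distances in millimetres, with 0 meaning "no reading"; the function name `foreground_mask`, the sample numbers and the 1.5 m cutoff are all made up for illustration and are not part of any Kinect API.

```python
def foreground_mask(depth_mm, cutoff_mm=1500):
    """Return a grid of booleans: True where a pixel is nearer than cutoff_mm.

    A depth of 0 is treated as "no reading" and counted as background
    (an assumption made for this sketch).
    """
    return [[0 < d < cutoff_mm for d in row] for row in depth_mm]


# A tiny hypothetical 3x4 depth frame: a person about 0.9 m away
# in front of a wall roughly 3 m away.
depth_mm = [
    [3200, 3100, 3150, 3300],
    [3100,  900,  950, 3200],
    [3000,  880,  910,    0],
]

mask = foreground_mask(depth_mm)
for row in mask:
    print("".join("#" if near else "." for near in row))
# ....
# .##.
# .##.
```

A single fixed cutoff is the crudest possible rule; anything smarter (tracking the nearest object, for example) would slot into the same place.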

- Example of depth rendering.
Reducing the resolution of the unimportant parts of the image would reduce the bandwidth needed tremendously, allowing video conferencing in situations where it wouldn’t usually be possible.
Another use of the depth information is as a layer of privacy. Why do the people receiving the video (the end-client) need to see the background at all, especially if the sender doesn’t want them to see anything in it?
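As a rough illustration of where the saving comes from, the toy Python sketch below cuts a greyscale frame into small tiles, keeps every pixel of any tile that touches the foreground, and boils each background tile down to a single average value. This is not how a real video codec works and the figures say nothing about real-world bandwidth; `encode_frame`, `values_sent` and the sample frame are hypothetical names and data used only to show the idea.

```python
def encode_frame(gray, mask, block=2):
    """Split the frame into block x block tiles.

    Tiles containing any foreground pixel are kept pixel for pixel;
    background-only tiles are reduced to one averaged value.
    """
    height, width = len(gray), len(gray[0])
    tiles = []
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            pixels = [gray[r][c] for r in rows for c in cols]
            if any(mask[r][c] for r in rows for c in cols):
                tiles.append(("full", top, left, pixels))
            else:
                tiles.append(("avg", top, left, [sum(pixels) // len(pixels)]))
    return tiles


def values_sent(tiles):
    """Count how many pixel values the encoded frame carries."""
    return sum(len(tile[3]) for tile in tiles)


# Hypothetical 4x4 frame with the foreground in the top-left corner.
gray = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [50, 50,  90,  90],
    [50, 50,  90,  90],
]
mask = [
    [True,  True,  False, False],
    [True,  False, False, False],
    [False, False, False, False],
    [False, False, False, False],
]

tiles = encode_frame(gray, mask)
print(values_sent(tiles), "values instead of", len(gray) * len(gray[0]))
# 7 values instead of 16
```

The more of the frame that counts as background, the bigger the saving, which is exactly the case for one person sat in front of a camera.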
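The privacy idea is even simpler to sketch: anything further away than a chosen distance is painted over before the frame leaves the sender's machine. The Python below assumes the colour image and the depth map are already lined up pixel for pixel (a real device would need that alignment done first); `hide_background`, the cutoff and the sample values are illustrative only.

```python
def hide_background(image, depth_mm, cutoff_mm=1500, fill=(0, 0, 0)):
    """Replace every pixel that is not nearer than cutoff_mm with a flat colour.

    Pixels with a depth of 0 (no reading) are hidden as well, so that
    nothing unknown leaks through.
    """
    return [
        [pixel if 0 < d < cutoff_mm else fill
         for pixel, d in zip(image_row, depth_row)]
        for image_row, depth_row in zip(image, depth_mm)
    ]


# Hypothetical 2x3 RGB frame and its matching depth map (millimetres).
image = [
    [(200, 180, 160), (90, 90, 90), (30, 60, 120)],
    [(210, 185, 165), (95, 95, 95), (35, 65, 125)],
]
depth_mm = [
    [800, 2600, 3000],
    [820,    0, 3100],
]

private = hide_background(image, depth_mm)
print(private[0])
# [(200, 180, 160), (0, 0, 0), (0, 0, 0)]
```

Hiding pixels with no depth reading errs on the side of the sender, at the cost of the odd hole in the foreground.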
As with all such applications, it will look rather terrible to begin with. Both the RGB camera within the Kinect unit and the IR blaster/receiver have a limited resolution, so the images will be kind of broken at the edges (where the change in render quality has to happen), and the processing required, though not much, is still another issue. For example, if this was to be implemented in third-world countries, you would need to keep the processing requirements down in order to maintain the cheapness factor.
I believe that Kinect isn’t really going to be a gaming platform for very long. It is kind of awkward to play the games made for it, and there is the inevitable lack-of-space problem (and the over-furnished rooms we all have). Rather, I can see it being sold for other purposes on the PC platform. Situations such as the one described above, and many, many more, make the device seem all of a sudden more practical to me.