In a race to optimize everything, developers often go to extremes to build software that performs routine tasks. MissionControl is a system that allows users to program a control center that stores interfaces with attached hardware sensors, allowing the users to control any other devices that can be activated via the underlying protocol. For demo purposes, the MissionControl build at this point is compatible with the Phidgets IR hybrid sensor.
The system has two core components:
- A server application, which is a Win32 console application that handles incoming queries and returns data to the connected clients. This application runs on the desktop machine with the connected sensor.
- The Windows Phone application that sends requests to the target server and can trigger a variety of pre-programmed commands.
Hardware and Communication Infrastructure
One of the most important parts of the project is the signal capture and replication hardware. For the purposes of this project, I decided to use a dual-mode Phidgets IR sensor. It supports both IR code capture and subsequent replication. From a user’s perspective, this device also eliminates a substantial code-learning overhead and reduces the potential error rate. Instead of searching for a device-specific hexadecimal sequence that later has to be transformed into a working IR code, the user simply has to point the remote control at the sensor and press the button that should be accessible from a mobile device. Given that the capturing software is running on the target machine, once the sensor detects that a code can be repeated within an acceptable precision range, it will be automatically captured and stored, with all required transformations worked out in the backend using the free Phidgets SDK.
Even though I can, I don’t have to handle the binary code content received through the sensor—the Phidgets .NET libraries carry built-in types that contain all the processed metadata that I will discuss later in this article.
This sensor is connected through a USB port to a machine that acts as a communication gateway. This server should have port 6169 open for inbound connections.
NOTE: The port number can be changed, but you have to keep it consistent between your server and client applications.
The communication between the phone and the computer running the client is performed via a TCP channel — sockets are used to perform the initial connections and serialized data transfer. You can see the generalized data flow between the devices that are involved in the procedure in the graphic below:
The server (desktop client) handles the local storage and release of all incoming IR codes. The mobile client has to know the location of the server. Once that location is specified and confirmed, the client can send one of the pre-defined commands to it and either query the server for existing command groups (sets) or invoke one of the stored IR codes. When I pass data between devices, I use JSON for the serializable components. The data is also processed before being sent in order to speed up the process. For example, on the server side the sets are serialized together with the associated codes, like this:
The inherent problem with the JSON data above is the fact that the phone client does not need the information related to the code binary sequence and all the metadata that goes with it. So it is effectively stripped down and reduced to the names of the sets (when a list of sets is requested) and commands (when a list of commands is requested).
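The stripping step described above can be sketched with a pair of LINQ projections. This is only an illustration: the `Set` and `Command` classes below are simplified stand-ins, and `PayloadTrimmer` with its two methods is a hypothetical helper, not a class from the project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the server-side models (assumed shapes).
public class Command
{
    public string Name { get; set; }
    public string Code { get; set; } // placeholder for the IR payload and its metadata
}

public class Set
{
    public string Name { get; set; }
    public bool IsList { get; set; }
    public List<Command> Commands { get; set; }
}

// Hypothetical helper: reduces the stored data to the names the phone needs.
public static class PayloadTrimmer
{
    // Names of all sets; codes and metadata stay on the server.
    public static List<string> GetSetNames(IEnumerable<Set> sets)
    {
        return sets.Select(s => s.Name).ToList();
    }

    // Names of the commands in one set; an empty list if the set is unknown.
    public static List<string> GetCommandNames(IEnumerable<Set> sets, string setName)
    {
        Set match = sets.FirstOrDefault(s => s.Name == setName);
        return match == null
            ? new List<string>()
            : match.Commands.Select(c => c.Name).ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var sets = new List<Set>
        {
            new Set
            {
                Name = "TV",
                IsList = true,
                Commands = new List<Command> { new Command { Name = "Power", Code = "..." } }
            }
        };

        Console.WriteLine(string.Join(", ", PayloadTrimmer.GetSetNames(sets)));
        Console.WriteLine(string.Join(", ", PayloadTrimmer.GetCommandNames(sets, "TV")));
    }
}
```

Either list can then be serialized on its own, which keeps the payload sent to the phone small.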
The Data Model
As you saw from the description above, the server organizes individual infrared codes in sets. A single set is a bundle of codes that may or may not be related to each other—ultimately, this is the user’s decision. A good example of using sets is organizing IR commands by rooms, devices or code types. Each set has a unique name on the server, therefore eliminating the possibility of a request conflict.
Each set stores individual commands built around the Command model:
Besides the obvious Name property, you can see that I am using a SerializableIRCode instance that is specific to each model. Before going any further, I need to mention that the Phidgets SDK offers the IRLearnedCode model to store code contents. I could have used it instead, but there is an issue that prevents me from doing that: there is no public constructor defined for IRLearnedCode, so there is no way to serialize it, either with the built-in .NET serialization capabilities or with JSON.NET, which I am using in the context of the project.
Instead, I have this:
It is an almost identical 1:1 copy of the original class, storing both the layout of the IR code and additional information related to its replication mechanism. You can learn more about each property listed in the model above by reading the official document on the topic.
ToggleMask, the identity bit carrier that helps mark the code as repeated or not, is also implemented through a built-in Phidgets SDK model, and it has the same problem as IRLearnedCode. I implemented this model to replace it in the serializable code:
I also needed an easy way to store all sets at once and carry all associated codes in a single instance retrieved from the storage. Here is the
Notice that there is an IsList flag that allows me to specify how to display this specific list on the connecting device. This adds some level of flexibility for situations where the user wants to build a virtual remote for closely related keys, such as digits. Displaying those as a list might be inconvenient, wasting visual space on the client. If the flag is set to false, the set can be displayed as a pad instead.
Also, when the server performs the data exchange, it provides a single “envelope” that allows the connecting device to easily understand what the server is trying to do:
The Identifier property carries the server IP address. That way, when a device receives a response, it is able to either accept it, because it knows that a response is requested from a target location, or discard it because the user is no longer using the specific server.
Marker carries the command type of the sent command, therefore giving the Windows Phone application a hint as to what to do with the data. The server can send the following commands:
- SET_LIST – returns the list of sets that are currently available on the server.
- SET_COMMANDS:SET_NAME:IS_LIST – returns the list of commands that are associated with a given set that is currently stored on the server.
- NOTIFICATION – sends a simple notification to the client; no further action is required.
Last but not least, Content is used to push the necessary data that is associated with the given Marker. It can be either a JSON-based string that lists the sets or commands, or a plain-text message that is used as an alert for the end-user.
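To make the envelope concrete, here is a minimal sketch of a class with the three properties named above, plus a helper that builds and parses the SET_COMMANDS:SET_NAME:IS_LIST marker. `MarkerHelper` is a hypothetical name, and the textual form of the IS_LIST part (a `True`/`False` string) is an assumption, since the article does not spell out the exact wire format.

```csharp
using System;

// The three property names come from the article; everything else is assumed.
public class ServerResponse
{
    public string Identifier { get; set; } // server IP address
    public string Marker { get; set; }     // SET_LIST, SET_COMMANDS:<set>:<isList>, or NOTIFICATION
    public string Content { get; set; }    // JSON string or plain-text message
}

// Hypothetical helper for composing and decomposing the SET_COMMANDS marker.
public static class MarkerHelper
{
    private const string Prefix = "SET_COMMANDS:";

    public static string ForSetCommands(string setName, bool isList)
    {
        return Prefix + setName + ":" + isList;
    }

    // Splits on the LAST colon so that set names containing ':' still parse.
    public static bool TryParseSetCommands(string marker, out string setName, out bool isList)
    {
        setName = null;
        isList = false;
        if (marker == null || !marker.StartsWith(Prefix, StringComparison.Ordinal))
        {
            return false;
        }

        int lastColon = marker.LastIndexOf(':');
        if (lastColon < Prefix.Length)
        {
            return false; // the IS_LIST part is missing
        }

        setName = marker.Substring(Prefix.Length, lastColon - Prefix.Length);
        return bool.TryParse(marker.Substring(lastColon + 1), out isList);
    }
}

public static class Program
{
    public static void Main()
    {
        var response = new ServerResponse
        {
            Identifier = "192.168.1.5",
            Marker = MarkerHelper.ForSetCommands("Living Room", false),
            Content = "[\"Power\",\"Mute\"]"
        };

        string setName;
        bool isList;
        if (MarkerHelper.TryParseSetCommands(response.Marker, out setName, out isList))
        {
            Console.WriteLine(setName + " (list: " + isList + ")");
        }
    }
}
```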
The server is the only component of this entire system that does all the heavy lifting. It learns commands, stores them and then generates new IR signal requests, as controlled from any of the connected clients. Let’s take a closer look at what happens behind the scenes — to start, I am going to document the network infrastructure.
The Network Layer
To be a reliable system, the server always needs to be ready to accept an incoming connection. For that purpose, it is possible to use the TcpListener class, an “always on” receiver that can handle incoming TCP connections. I integrated it in my CoreStarter class that is used to start the listener when the application is launched:
When LaunchSocket is called, the listener is activated on the current machine. As I mentioned above, the port number can be arbitrarily assigned, but has to be consistent between connecting apps in order for the TCP links to be established. Because I expect that more than one device will be connecting to the service at a time, the listener is set as active across a constant number of threads.
NOTE: By default, there is a maximum limit of 5 simultaneous clients. This number can be adjusted, but keep in mind how many devices realistically need to connect in your environment. Even though the performance footprint of each thread is minimal, an unnecessarily large number of threads can have a negative effect.
ListenForData is used to read the incoming stream. When an inbound connection is accepted, the data is read with the help of a fixed content buffer. A read timeout is specified to prevent situations where the stream was completely read but the application still waits to pull non-existent data. Once the timeout is hit, an exception is thrown, which marks the end of the stream. At this point, the plain-text data that was received (remember that both the server and client exchange text data only) is passed to the command interpreter, CommandHelper, with a reference to the source of the command.
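The read-until-timeout pattern described above can be sketched as follows. This is not the project's ListenForData: `ListenerSketch`, `ReadAllText` and `AcceptOne` are hypothetical names, the sketch handles a single connection on one thread, and the demo binds to an OS-assigned loopback port instead of 6169 so that it cannot collide with a running server. The `HELLO` key in the payload is a placeholder.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Text;

public static class ListenerSketch
{
    // Reads text until the peer closes the connection or the read timeout expires.
    public static string ReadAllText(NetworkStream stream, int timeoutMs)
    {
        stream.ReadTimeout = timeoutMs;
        var buffer = new byte[1024];        // fixed content buffer
        var received = new MemoryStream();  // accumulates raw bytes before decoding
        try
        {
            int count;
            while ((count = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                received.Write(buffer, 0, count);
            }
        }
        catch (IOException)
        {
            // The timeout expired with nothing left to read: treat it as end of message.
        }
        return Encoding.UTF8.GetString(received.ToArray());
    }

    // Accepts one pending connection and returns the text it sent.
    public static string AcceptOne(TcpListener listener, int timeoutMs)
    {
        using (TcpClient client = listener.AcceptTcpClient())
        using (NetworkStream stream = client.GetStream())
        {
            return ReadAllText(stream, timeoutMs);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0); // port 0: let the OS choose
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        using (var sender = new TcpClient())
        {
            sender.Connect(IPAddress.Loopback, port);
            byte[] payload = Encoding.UTF8.GetBytes("{\"Key\":\"HELLO\",\"Value\":\"\"}");
            sender.GetStream().Write(payload, 0, payload.Length);

            Console.WriteLine(ListenerSketch.AcceptOne(listener, 500));
        }

        listener.Stop();
    }
}
```

Relying on the timeout as an end-of-message signal keeps the protocol simple, at the cost of adding the timeout's duration to every request whose sender keeps the connection open.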
The commands from the device are passed as serialized key-value pairs (KeyValuePair<T, T>), the key being the command with any possible suffixes, and the value being the contents of the command itself that helps the server identify the specific item in the local storage.
InterpretCommand, in this case, does three things sequentially:
- Deserialize the incoming string and create a
- Process the command and check whether it is recognizable.
- Send a response to the client, if deemed necessary by the command type.
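The second and third steps can be sketched as a dispatch on the command name once the pair has been deserialized. The constants below are placeholders (only EXECUTE and DELETE_COMMAND are named in this article, and the real constant values are not shown here), and `CommandInterpreterSketch` is a hypothetical class that returns the marker of the reply instead of sending anything over the network.

```csharp
using System;
using System.Collections.Generic;

public static class CommandInterpreterSketch
{
    // Placeholder command names; treat them as assumptions.
    public const string Hello = "HELLO";
    public const string Execute = "EXECUTE";
    public const string DeleteCommand = "DELETE_COMMAND";

    // Returns the marker of the reply to send, or null when no reply is needed.
    public static string Interpret(KeyValuePair<string, string> command)
    {
        string key = command.Key ?? string.Empty;

        // The command name is everything before the first ':'; suffixes follow it.
        int colon = key.IndexOf(':');
        string name = colon < 0 ? key : key.Substring(0, colon);
        string suffix = colon < 0 ? string.Empty : key.Substring(colon + 1);

        switch (name)
        {
            case Hello:
                return "SET_LIST"; // the client wants the list of sets
            case Execute:
                Console.WriteLine("Executing '" + command.Value + "'");
                return null; // fire and forget
            case DeleteCommand:
                Console.WriteLine("Deleting '" + command.Value + "' from '" + suffix + "'");
                return "SET_COMMANDS:" + suffix; // reply with the remaining commands
            default:
                Console.WriteLine("Unrecognized command: " + key);
                return null;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var request = new KeyValuePair<string, string>("DELETE_COMMAND:TV", "Power");
        Console.WriteLine(CommandInterpreterSketch.Interpret(request) ?? "(no reply)");
    }
}
```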
The serialization and deserialization are done via JSON.NET. You can install this package in your console managed Win32 project and the Windows Phone application project via NuGet:
The deserialization step is as simple as one line of C# code:
The string is sanitized to ensure that only JSON content is being passed to the serializer.
Because of a relatively limited command set, I can put together the entire interpretation stack like this:
All commands are constants, declared in the local helper class:
Notice that these are not the commands that the server sends back, but rather the commands it receives from connecting Windows Phone devices.
Let’s now take a look at the breakdown for each command.
When the initial handshake command is received, the server does not have to do much processing. It is only invoked when the client establishes the initiating link and needs to know what possible sets it can get from the target machine. The request is logged in the console, and the server prepares a response that contains a serialized list of set names. The response itself is then serialized and sent back to the source machine.
NetworkHelper will be documented later in this article.
When a mobile device attempts to create a new set on the server, it sends a command in the following format:
CreateSet will get the type of the set that was created, will check whether a set with the same name already exists and will either create it or ignore the command altogether. No notification is sent to the connecting device, but either the failure or the success of the command is registered in the local console.
Commands are sent in the same manner as sets. Once the set is recognized, the names of the associated commands are retrieved and serialized inside a ServerResponse instance and then pushed back to the requesting device.
Once a request is received that the server needs to learn a new command, an initial verification is done to make sure that the requested command name and set are not already taken. If neither the command nor the set exists, both will be created.
After the basic setup is complete, the IR sensor is activated and will be waiting for the command to be learned. The way it works is quite simple – the sensor will remain in learning mode until the point where it recognizes a command without error, being 100% sure that it can be reproduced internally. You will need to point your remote towards the sensor and hold the button you want captured for one or two seconds in order for the command to be learned.
NOTE: To ensure that a proper transmission is done, I manually set the minimal repeat value to 5. This is the number of times the sensor will fire the same code towards the target. That is the optimal value for a target device to receive the code if the remote is pointed directly at it without necessarily triggering the same command twice or more.
After the command is learned, the code is processed and transformed into a serializable instance. The connecting client is then notified about whether the command was learned.
Command execution relies on the hardware sensor. The phone sends a command execution request in the following format:
Once the command is parsed out and found in the local storage, the IR code is transformed back to a model that is recognizable by the Phidgets SDK and transmitted towards the location where the sensor is pointed at the time of the execution.
When deleting a set, only the name of the set should be specified. The user will get a warning on the client side that requires a confirmation of the deletion. The server will blindly execute the command.
Not only can the user remove entire sets, but he can also target specific commands from a given set. Once a DELETE_COMMAND directive is recognized, the set name is parsed out from the original string, which follows the DELETE_COMMAND:SET_NAME, COMMAND_NAME format, and a simple LINQ query extracts the command instance, removes it and stores the set content on the local hard drive.
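A minimal sketch of that lookup-and-remove step is shown below. It assumes that the key of the incoming pair carries DELETE_COMMAND:SET_NAME and that the value carries the command name, which is one reading of the format above. The `Set` and `Command` classes are simplified stand-ins, `DeleteCommandSketch` is a hypothetical helper, and persisting the change to disk is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the server-side models (assumed shapes).
public class Command
{
    public string Name { get; set; }
}

public class Set
{
    public string Name { get; set; }
    public List<Command> Commands { get; set; }
}

public static class DeleteCommandSketch
{
    private const string Prefix = "DELETE_COMMAND:";

    // Returns true when the command was found and removed.
    public static bool Handle(List<Set> sets, KeyValuePair<string, string> request)
    {
        if (request.Key == null || !request.Key.StartsWith(Prefix, StringComparison.Ordinal))
        {
            return false; // not a DELETE_COMMAND directive
        }

        string setName = request.Key.Substring(Prefix.Length);
        Set set = sets.FirstOrDefault(s => s.Name == setName);
        if (set == null)
        {
            return false; // unknown set
        }

        Command target = set.Commands.FirstOrDefault(c => c.Name == request.Value);
        if (target == null)
        {
            return false; // unknown command
        }

        set.Commands.Remove(target);
        return true; // the caller would now persist the sets and reply with the remaining names
    }
}

public static class Program
{
    public static void Main()
    {
        var sets = new List<Set>
        {
            new Set
            {
                Name = "TV",
                Commands = new List<Command> { new Command { Name = "Power" }, new Command { Name = "Mute" } }
            }
        };

        bool removed = DeleteCommandSketch.Handle(sets, new KeyValuePair<string, string>("DELETE_COMMAND:TV", "Power"));
        Console.WriteLine(removed + ", remaining: " + sets[0].Commands.Count);
    }
}
```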
Notice that for some commands, particularly for set creation, deletion and command deletion, the server will return a list of the remaining items. The contents will be automatically updated on the devices, which will be waiting for that response. This measure was deliberately introduced to minimize the chances of a user triggering a command that was already deleted or trying to query a previously removed set.
You might have noticed that I am using IRCodeWorker.GetSerializableCodeType to transform a Phidgets SDK native IR code model into a serializable one. This is a helper function that performs a field copy of the existing object. Because of the differences in the model structure, it has to be done manually:
The reverse process is easier because I can pass each of the existing properties to the IRCodeInfo constructor. The only difference is the fact that I need to use Reflection to create an instance of IRLearnedCode because there is no public constructor defined and a dynamic object has to be created:
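The general reflection technique can be illustrated without the Phidgets SDK. In the sketch below, `SealedCode` is a made-up stand-in for a type that hides its constructor (it is not the real IRLearnedCode, whose constructor parameters are not shown here), and `ReflectionFactory` is a hypothetical helper built on `Activator.CreateInstance`.

```csharp
using System;
using System.Reflection;

// Stand-in for an SDK type with no public constructor.
public sealed class SealedCode
{
    public string Data { get; private set; }

    private SealedCode(string data)
    {
        Data = data;
    }
}

public static class ReflectionFactory
{
    // Creates an instance of T through a non-public instance constructor
    // whose parameters match the supplied arguments.
    public static T CreateNonPublic<T>(params object[] args)
    {
        return (T)Activator.CreateInstance(
            typeof(T),
            BindingFlags.Instance | BindingFlags.NonPublic,
            null,  // default binder
            args,
            null); // current culture
    }
}

public static class Program
{
    public static void Main()
    {
        // "new SealedCode(...)" would not compile here because the constructor is private.
        SealedCode code = ReflectionFactory.CreateNonPublic<SealedCode>("0x20DF10EF");
        Console.WriteLine(code.Data);
    }
}
```

Because this bypasses the access rules the type's author chose, it can break when the library changes its internal constructor signature.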
Command and Set Management
Looking back at the code that I put together for the command interpreter, there is one class that does all local content manipulation—StorageHelper. This is a simple class that performs LINQ queries on set as well as command collections, and makes sure that all the changes are preserved in the sets.xml file in the application folder that is used as the only storage place for all the content that is being manipulated by the server.
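As a rough idea of what persisting the sets to an XML file can look like, here is a sketch built on `XmlSerializer`. Whether StorageHelper uses this serializer is an assumption; the article only states that the data ends up in sets.xml. `StorageSketch`, `Save` and `Load` are hypothetical names, and the models are simplified stand-ins.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

// Simplified stand-ins; XmlSerializer needs public types with parameterless constructors.
public class Command
{
    public string Name { get; set; }
}

public class Set
{
    public string Name { get; set; }
    public bool IsList { get; set; }
    public List<Command> Commands { get; set; }
}

public static class StorageSketch
{
    private static readonly XmlSerializer Serializer = new XmlSerializer(typeof(List<Set>));

    // Overwrites the file with the current state of all sets.
    public static void Save(List<Set> sets, string path)
    {
        using (FileStream stream = File.Create(path))
        {
            Serializer.Serialize(stream, sets);
        }
    }

    // Returns an empty list when nothing has been stored yet.
    public static List<Set> Load(string path)
    {
        if (!File.Exists(path))
        {
            return new List<Set>();
        }

        using (FileStream stream = File.OpenRead(path))
        {
            return (List<Set>)Serializer.Deserialize(stream);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // A temporary folder is used so the demo does not touch a real sets.xml.
        string path = Path.Combine(Path.GetTempPath(), "sets-demo.xml");

        var sets = new List<Set>
        {
            new Set { Name = "TV", IsList = true, Commands = new List<Command> { new Command { Name = "Power" } } }
        };

        StorageSketch.Save(sets, path);
        Console.WriteLine(StorageSketch.Load(path)[0].Commands[0].Name);
        File.Delete(path);
    }
}
```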
Sending Data Back to the Client
SendData in the NetworkHelper class handles all outbound connections. Here is its structure:
A new stream socket is created in order to connect to the target machine over the TCP pipe. If IP sanitization is enabled, the port is stripped from the address in order to pass a valid IP. A Socket instance cannot directly handle IPs of the format:
Later, in a synchronous manner, a connection is established and the data is sent.
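The port-stripping step can be sketched like this. `AddressSketch` and its methods are hypothetical names, and the approach assumes IPv4 addresses in `address:port` form; an IPv6 literal, which contains colons of its own, would need different handling.

```csharp
using System;
using System.Net;

public static class AddressSketch
{
    // Removes a trailing ":port" from an IPv4 address string, if one is present.
    public static string StripPort(string address)
    {
        if (string.IsNullOrEmpty(address))
        {
            return address;
        }

        int colon = address.LastIndexOf(':');
        return colon < 0 ? address : address.Substring(0, colon);
    }

    // Builds the endpoint that a socket can connect to.
    public static IPEndPoint ToEndPoint(string address, int port)
    {
        return new IPEndPoint(IPAddress.Parse(StripPort(address)), port);
    }
}

public static class Program
{
    public static void Main()
    {
        IPEndPoint endPoint = AddressSketch.ToEndPoint("192.168.1.5:50321", 6169);
        Console.WriteLine(endPoint);
    }
}
```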
At this point, you can see that the barebones service offers a flexible way to manage content. It can be used by any type of application, as long as the application can reach the server, sends commands in the pre-defined format, and requests content that is actually located on the target server. This allows for high levels of extensibility and interoperability, as the server usage is not limited to a single platform. If I decide to create a Windows Store application that would allow me to control my TV, I simply need to add a socket connection layer that will send plain strings to the machine where the IR sensor is connected.
Similarly, if some functionality needs to be added, it is possible to do so without ever touching the client applications. A modification to the endpoint has no direct effect on the connecting applications, as long as the existing request and response formats are preserved. The only additional requirement is that if the client applications want to take advantage of newly introduced capabilities, they need to have an updated command transmission layer for the new command types.
In Program.cs, I simply need to start the server through the
Mobile client overview
The mobile client does not have the capability to send commands directly to the IR sensor. Instead, it connects to a remote machine that has the IR sensor plugged in and attempts to invoke a command from the list returned by the service. A single mobile client can support control over multiple servers.
NOTE: Make sure that at the time of working with the Windows Phone client, the server is actually running on your local machine. To make it easier to test, also open port 6169 for incoming connections in Windows Firewall.
The Windows Phone application also relies on a network infrastructure somewhat similar to that of the server. There is a TCP listener that is created when the application is started:
Here, listener is an instance of TcpSocketListener, a custom class designed to handle incoming network connections:
StreamSocketListener is used for the connection core. When a connection is received, a continuous loop reads the entire contents of the incoming stream.
OnConnectionCompleted is declared in the base class —
ConnectionEventArgs here is used to identify the content that is passed to the client. DeviceID gives access to the source IP, IsSuccessful tells the developer whether the established connection is active, and Token carries the raw string if any was received.
Sending data is simplified to the maximum with the help of the SocketClient class, which relies on a StreamSocket instance that handles outbound connections and writing to the output stream:
As with the listener class, it uses OnConnectionCompleted to notify the application that the connection attempt completed.
In App.xaml.cs, the data from the incoming connection captured by the TcpSocketListener instance is passed to the
This class reads the three possible commands sent by the server and interprets them, creating internal collections from the raw data if the current server IP matches the one obtained in the ServerResponse (the same model as in the desktop application):
If the response comes from a server that is different than the one that is currently active, the data is discarded as the user no longer needs it. Also, for specific commands, the mobile application will be on standby, waiting for a response (unless the user decides to cancel the request) – the IsWaiting flag is an application-wide indicator that a pending server action is in the queue.
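The accept-or-discard decision can be sketched as a small filter. `ResponseFilter` is a hypothetical class, not the helper from the application, and clearing the waiting flag when a matching response arrives is an assumption about how the IsWaiting indicator is reset.

```csharp
using System;

// Same three properties as the server-side envelope.
public class ServerResponse
{
    public string Identifier { get; set; }
    public string Marker { get; set; }
    public string Content { get; set; }
}

public class ResponseFilter
{
    public string ActiveServerIp { get; set; }

    // True while the application is waiting for a server reply.
    public bool IsWaiting { get; set; }

    // Returns true when the response belongs to the active server and was accepted.
    public bool Accept(ServerResponse response)
    {
        if (response == null || response.Identifier != ActiveServerIp)
        {
            return false; // stale response from a server the user has left
        }

        IsWaiting = false; // assumption: a matching reply ends the pending state
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var filter = new ResponseFilter { ActiveServerIp = "192.168.1.5", IsWaiting = true };

        var stale = new ServerResponse { Identifier = "10.0.0.9", Marker = "SET_LIST", Content = "[]" };
        var current = new ServerResponse { Identifier = "192.168.1.5", Marker = "SET_LIST", Content = "[]" };

        Console.WriteLine(filter.Accept(stale) + " / waiting: " + filter.IsWaiting);
        Console.WriteLine(filter.Accept(current) + " / waiting: " + filter.IsWaiting);
    }
}
```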
Same as with the server, the commands in the Windows Phone application are represented through pre-defined constants:
Let’s now take a closer look at how it is handled internally to build the visual layer.
Handling the Data
The first thing users will see when the application is launched is the list of registered servers:
ServiceListPage.xaml. The list of servers that were added is retrieved from the isolated storage on application startup, with the help of the standard serialization routine implemented in the Coding4Fun Toolkit — specifically, its storage subset (you can get it via NuGet):
The one-liner that initializes the internal server collection is as follows:
The SERVERS_FILE constant is equal to servers.xml. It is a good idea to use constants for file names in order to be able to later modify the location through a single change instead of digging through the many source files in a solution to find references to the old location.
The user can define an unlimited number of servers, as long as he can access those. There is no restriction on the location of the server itself — it can work with the desktop in your room just as well as with a PC on the other end of the world (yes, this was tested).
When adding a new server, the user is redirected to AddServicePage.xaml, where he can fill in connection details, as well as the location of an image that would help him identify that specific item in the general list:
Once data entry is complete, it is validated internally to make sure that the server is not already registered with the same name and location. If the validation step passes, the server is added to the list of local access points and the user is returned back to the server selection page:
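The duplicate check can be expressed as a single LINQ query. The `Server` class and its property names, as well as `ServerRegistry.TryAdd`, are hypothetical, and the case-insensitive comparison is an assumption rather than the application's documented behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical model of a registered server.
public class Server
{
    public string Name { get; set; }
    public string Location { get; set; }
}

public static class ServerRegistry
{
    // Adds the server unless one with the same name and location is already registered.
    public static bool TryAdd(ICollection<Server> servers, Server candidate)
    {
        bool duplicate = servers.Any(s =>
            string.Equals(s.Name, candidate.Name, StringComparison.OrdinalIgnoreCase) &&
            string.Equals(s.Location, candidate.Location, StringComparison.OrdinalIgnoreCase));

        if (duplicate)
        {
            return false;
        }

        servers.Add(candidate);
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var servers = new List<Server>();
        Console.WriteLine(ServerRegistry.TryAdd(servers, new Server { Name = "Den", Location = "192.168.1.5" }));
        Console.WriteLine(ServerRegistry.TryAdd(servers, new Server { Name = "den", Location = "192.168.1.5" }));
    }
}
```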
When the user selects a server, the application needs to show SetsPage.xaml. Before the actual navigation, however, it also has to check whether the server is active. With the help of internal bindings, I am doing it through a
COMMAND_SERVER_HELLO represents the initial handshake command that I mentioned earlier; it requests the list of sets on the target server. To streamline command processing, CommandClient is used. It wraps around the SocketClient class, letting me call SendCommand with the command metadata without having to explicitly handle socket interactions in my views:
From here on, ResponseHelper is once again involved, grouping all the data alphabetically. Remember this call:
The grouped collection is later bound to a
For each handshake call to the server, the set collection will be re-initialized, in case the server was updated by another device while the user was not taking any actions.
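Alphabetical grouping of the returned names can be sketched with `GroupBy`. This is a generic illustration: `NameGrouping` is a hypothetical helper, and the real application may use a different grouping type to feed its list control.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NameGrouping
{
    // Groups names by their upper-cased first character, with groups and items sorted.
    public static List<KeyValuePair<char, List<string>>> GroupAlphabetically(IEnumerable<string> names)
    {
        return names
            .Where(n => !string.IsNullOrEmpty(n))
            .GroupBy(n => char.ToUpperInvariant(n[0]))
            .OrderBy(g => g.Key)
            .Select(g => new KeyValuePair<char, List<string>>(
                g.Key,
                g.OrderBy(n => n, StringComparer.OrdinalIgnoreCase).ToList()))
            .ToList();
    }
}

public static class Program
{
    public static void Main()
    {
        var names = new[] { "TV", "amplifier", "Television", "Blinds" };
        foreach (KeyValuePair<char, List<string>> group in NameGrouping.GroupAlphabetically(names))
        {
            Console.WriteLine(group.Key + ": " + string.Join(", ", group.Value));
        }
    }
}
```

Rebuilding the grouped collection from scratch on every handshake is the simplest way to stay in sync with changes made from other devices.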
Adding a set takes the user to AddSetPage.xaml, where the user input is once again validated and the appropriate command sent to the currently selected server:
The end-user is also able to specify whether the new set is a list or a pad. Since the server does not explicitly define the type of a set beyond marking whether it’s a list, it is possible to have an arbitrary type here.
To give you an idea of what it looks like in the current release of MissionControl, here is the pad representation of a set of commands:
It is a convenient way to display buttons for typical actions, such as channel switching through digits. Since we can safely assume many of those will be tapped sequentially, a list would be inconvenient to scroll through.
On the other hand, some remote control commands work well with a list because no sequences are invoked most of the time:
If the pad is not desired, it can easily be swapped with another design and internal template - the appearance is swapped dynamically and is not hard-bound to a string value.
Once a set is selected, a connection attempt is made to the current server in order to check whether there is still a communication channel available with the resource that fetched the initial list of commands. If a connection is established, the server will also return a set of commands that are available in the set at the time of the request.
You’ve probably already noticed that both for commands and sets, the initial routine verifies the connection to the server. The server might go dark after the set list is loaded, therefore rendering any attempt to process other commands impossible. To avoid scenarios in which the user is waiting for a response from a server that doesn’t run, the user is notified before being redirected to the subsequent view, if the connection fails. That way unnecessary navigation passes are out of the picture.
If the user selects a command from one of the lists demonstrated above, an EXECUTE directive is issued via the
Once the server receives the command, it will send it to the target without additional notifications being released to the connecting client.
When it comes to learning a new remote control code in LearnCodePage.xaml, the procedure is exactly the same as with any other part of the server communication process: a LEARN_CODE command is sent to the server with the associated set and new command name, and the server will wait for incoming IR input, leaving the connecting device free (no waiting lock is issued):
Once the server learns a new command — if, and only if, the user still works in the context of the same server — an alert will be displayed, telling the user whether the command was successfully learned.
For convenience purposes, I also implemented a quick launch panel, where frequently-used commands can be placed. Whenever a user wants to add something here, he will tap-and-hold on an existing command in any of the sets that are available for any given server, and select the “add to quick launch” option. Once completed, the stored commands will be available on the main page, even when the user is not directly connected to the server that carries the command:
Because this interaction layer is placed outside the boundaries of a single server or set, I needed to create a special data model to store the quick commands and the related connection information, that would let me call the server even when it is not the currently selected one:
Same as with the list of servers, the list of favorites is deserialized on application startup:
Logically, we also need a way to eliminate trailing commands for servers or sets that have been removed, since those can no longer be invoked or might have a different meaning on servers that were added later and have the same IP as the previous owner. This is easily done with a simple LINQ expression that is passed to RemoveTrailingFavorites in the
A typical usage scenario is reflected in the server removal snippet:
Because ObservableCollection<T> is used for both the list of servers and quick launch commands, the view will be instantly updated to reflect the changes.
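A predicate-driven cleanup of that kind can be sketched as follows. Only the method name RemoveTrailingFavorites comes from the article; the `Favorite` model, its properties, the `FavoritesSketch` class and the method signature are assumptions made for illustration.

```csharp
using System;
using System.Collections.ObjectModel;
using System.Linq;

// Hypothetical model of a quick launch entry with its connection details.
public class Favorite
{
    public string CommandName { get; set; }
    public string SetName { get; set; }
    public string ServerLocation { get; set; }
}

public static class FavoritesSketch
{
    // Removes every favorite that matches the predicate and returns how many were removed.
    public static int RemoveTrailingFavorites(
        ObservableCollection<Favorite> favorites,
        Func<Favorite, bool> shouldRemove)
    {
        // Take a snapshot first: removing items while enumerating the collection would throw.
        var doomed = favorites.Where(shouldRemove).ToList();
        foreach (Favorite favorite in doomed)
        {
            favorites.Remove(favorite); // raises CollectionChanged, so bound views refresh
        }
        return doomed.Count;
    }
}

public static class Program
{
    public static void Main()
    {
        var favorites = new ObservableCollection<Favorite>
        {
            new Favorite { CommandName = "Power", SetName = "TV", ServerLocation = "192.168.1.5" },
            new Favorite { CommandName = "Mute", SetName = "Amp", ServerLocation = "10.0.0.9" }
        };

        // Typical use: the server at 192.168.1.5 was just removed.
        int removed = FavoritesSketch.RemoveTrailingFavorites(favorites, f => f.ServerLocation == "192.168.1.5");
        Console.WriteLine(removed + " removed, " + favorites.Count + " left");
    }
}
```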
Improvements to the project
This specific project relies on a hybrid IR transmitter and receiver, which is not exactly cheap. As a step forward for this project, it can be adapted to use a central microcontroller that acts as a server (e.g. Netduino) and a series of IR emitters (instead of using a composite receiver/emitter) connected to it. Reduced cost for the IR infrastructure is key, as not every single component needs the capability to learn IR commands. You can have a single command capturing endpoint and multiple transmitters. This will also eliminate the need for a desktop client, since the server on the microcontroller can be built to be accessible via a web-browser.
Another important aspect not covered in this article is security. With the current workflow, anyone who has direct access to the server IP is able to do anything he wants with the data handled by the server. I am basing my writing on the assumption that you are testing the application on a secure local network and that the odds of something like this happening are close to zero. However, for other environments where tampering with a server might be unacceptable, consider implementing a layer of security between the server and the client.
With affordable microcontrollers and sensors, home and office automation can be a nice bonus resulting from little investment. This article covers the implementation of a proof-of-concept server and application that can be easily extended and adapted to a variety of environments and devices.