# Blog

In recent years, the hype around artificial intelligence (AI) has grown considerably. AI is not a new concept: the term has been around since the 1950s, when computer scientist John McCarthy coined it. But the idea of creating artificial beings is much older and can be found in Greek mythology and in Mary Shelley’s Frankenstein.

AI has also been portrayed in numerous books and movies, some more interesting than others. There are great classics like Stanley Kubrick’s 2001: A Space Odyssey (1968), inspired by a short story by Arthur C. Clarke. In the movie, the ship’s computer HAL 9000 (Heuristically programmed ALgorithmic computer) is the main antagonist. This is one of the first movies that presented the idea of a human-made AI to the masses.

Another classic movie is Blade Runner (1982), based on Philip K. Dick’s novel Do Androids Dream of Electric Sheep?. In the movie, humans have designed intelligent androids, called replicants, that the protagonist, Rick Deckard, must hunt down and retire (terminate).

One common problem in movies, and in sci-fi movies in particular, is how to balance an intricate and interesting story with special effects and action scenes. The sci-fi genre has so much potential when it comes to exploring concepts about science, technology, existentialism, human evolution and the possibility of extraterrestrial life. When the boundaries of our current paradigm are not a hard limit, imagination and science can blend in the most interesting ways. The movies and TV shows below are not built around special effects or action but rather try to explore what it is to be human.

#### Her (2013)

(image from themoviedb)

Director: Spike Jonze
Starring: Joaquin Phoenix, Amy Adams, Rooney Mara, Olivia Wilde, Scarlett Johansson
Genres: Sci-fi, Romance, Drama

Synopsis: Joaquin Phoenix plays a man who installs a new operating system with artificial intelligence to help him with various tasks and over time their relationship becomes more and more romantic.

Conveying a romantic relationship without the need for a materialized body, using only a voice, is beautifully executed. Most other movies that explore romantic human/machine involvement do so almost by cheating: using an android, more or less indistinguishable from a human, makes it much easier to relate to. Human/computer interaction by voice is not new and has been around for quite some time, mostly as an accessibility tool for people unable to use standard equipment. In October 2011, Apple launched Siri as the first voice-controlled virtual assistant integrated into a smartphone. At the time, the actions it could perform were rather limited, it had problems understanding voice input, and it could be seen as more of a gimmick than a valuable tool. Since then other companies have released voice-controlled virtual assistants, and the biggest competitors are Amazon Alexa, released in 2014, and the Google Assistant, released in 2016. These products are getting better and better, and voice-controlled virtual assistants are probably here to stay, although they are not perfect... yet.

Her, artificial intelligence and the concept of time

Humans, like animals, are evolutionarily equipped to experience some basic aspects of time. What happens when an artificial intelligence emerges with the computational power to live years, decades or millennia every day, hour or second? Its concept of time will be something completely different, which could have huge implications. For some perspective, OpenAI Five recently competed with five bots against five former professional gamers in the competitive game Dota 2. Every day the AI played 180 years’ worth of games against itself, running on 256 GPUs and 128,000 CPU cores. Summing over the five characters, this amounts to 900 years every day. Time is relative, but even more so for an AI running on a giant cluster, and there is a great scene about this in the movie.

#### Ex Machina (2014)

(image from themoviedb)

Director: Alex Garland
Starring: Domhnall Gleeson, Alicia Vikander, Oscar Isaac
Genres: Sci-fi, Drama, Mystery, Thriller

Synopsis: Meet Caleb Smith, a talented programmer at a large tech company, who wins an office competition to stay a week at the CEO’s remotely located house. When he arrives he is introduced to an android who, according to the CEO, Nathan Bateman, has passed the Turing test, but Nathan wants more validation before going public with the news. Surprise, surprise... Caleb starts to develop feelings for the android.

How close is humanity to developing a human-like conscious AI, and what happens then?

“What happens then?” is a question people have been debating for a long time. Some view this as the inevitable future and hope for the human race, and others see it as humanity’s demise. This has also been explored in sci-fi books and movies over the years, and The Terminator (1984) portrays one of the darker scenarios for mankind. But how close are we to developing an AI able to pass the Turing test? Probably not very close. Even though great progress has been made in the last few years, with everything from AlphaGo beating the top players at Go to Libratus, a poker AI able to beat top players in heads-up no-limit Texas hold ’em (and of course the OpenAI Dota 2 bots mentioned earlier), the Turing test is something else to beat. The AIs created today are heavily specialized and contextual and can’t do much outside their specialization. In order to pass the Turing test, an AI would have to behave like a human when exposed to a multitude of different questions on any topic.

The idea of sitting down with an AI, like Caleb, having a conversation and slowly becoming more and more amazed at how well it performs, is thrilling. This is something other than the Turing test, because in the Turing test it should not be known beforehand whether one is talking to a human or not, but that doesn’t make it less interesting. Problems will arise when AIs start to become too humanlike and at the same time have their own agenda. Sooner or later it would become really difficult to control all the possible outcomes when dealing with an AI.

#### Black Mirror (2011)

(image from themoviedb)

Creator: Charlie Brooker
Genres: Sci-fi, Drama, Mystery, Thriller

Black Mirror is a TV show exploring new technology, society and possible scenarios from today to a more distant future. It has been called a modern-day The Twilight Zone (1959), which is also a great show, although many of its episodes can feel rather outdated. Every episode is standalone and explores a new theme. This means that viewers don’t have to see the episodes in chronological order and can cherry-pick the ones that seem most interesting, and even skip the first episode of the first season. That episode is mostly built on shocking the audience and is not representative of the rest of the show. Some favourites:

The Entire History of You (2011)

Synopsis: Some people have an implant recording everything they see and hear.

Think Google Glass but built into the body, and it is not hard to see that this could create all kinds of problems. At the same time, a lot of people were really excited about the idea of wearing glasses that record everything, and honestly it is really not that far from our current reality.

San Junipero (2016)

Synopsis: Two women meet in a California-esque small town named San Junipero in 1987.

This is not a story about a relationship with an AI. This is about human relationships, love, consciousness and how future technology could change our lives for the better.

Hang the DJ (2017)

Synopsis: In a maybe dystopian future two people, Amy and Frank, try out a dating/matching service that always puts an expiration date on all relationships. The reason for this is that each short relationship will give “the system” more data to eventually find the optimal match for each individual using the service.

So what if you could see the expiration date for a relationship? Is it something you really want to know? And how would this knowledge change your actions? It is not that far from the classic theme of knowing the day you will die.

You have been working with electrical systems and software in vehicles your entire career. Why?

Actually, I’m not into cars or any other vehicles specifically. What drives me is mainly to solve problems in a challenging technical environment. It might as well have been airplanes or something else entirely. It just so happened that cars and buses had the right combination of technology and challenges to attract me.

How has your career developed at Combine?

The first step was into the telematics area. From there I moved on to Infotainment, and currently I am helping a customer with IT and processes relating to software development and software management.

When Combine sends you to help a customer, what can the customer expect?

I have realized that I have a knack for understanding how the processes, support systems and organization in a company are meant to facilitate technical development. Once I understand this, I bring out my broomstick and begin clean-up operations so that things work the way they are supposed to. Consequently, my CV is full of activities such as “responsible for project documentation”, “task force leader”, “team leader” and similar. In my current assignment I also have the opportunity to develop improvements and IT that raise the quality of the customer’s processes.

If you could choose a completely different assignment, what would it be?

I believe that we have the technology, or the ability to develop it, needed to help solve some of the big issues facing us globally, issues like our impact on the climate. Creating the right incentives and mechanisms, as well as developing the solutions themselves, would be really stimulating and interesting.

It certainly does. I have been brewing beer for many years and I finally started a microbrewery called Sad Robot Brewing. Being who I am, I tried to learn as much as possible about the engineering side of all the steps in the brewing process, such as chemistry and thermodynamics. Just like on an assignment, I like things to be clean and controlled, so the only solution was to team up with some friends and do it ourselves. It was just like a second job. I spend less time on brewing nowadays, but one of the things I have done lately is to mentor a thesis project at Combine aimed at controlling and monitoring the brewing process (Editor’s note: you can read about this in the Combine blog here).
I also do some acting in theatre and movies, so not everything is technical. And yes, you can probably figure out what kind of books and movies I like from the name of the brewery.

#### Image processing in Sympathy for Data

This is the second blog post in a series of posts on image processing using Sympathy for Data, an Open-Source tool for graphically programming data-flows. See the previous entry for an example of how you can read the time from an analog clock using only basic image processing building blocks. No programming required.

When it comes to object recognition today, most people think about deep learning and throw vast datasets onto deep machine learning algorithms, hoping that something will stick. One thing that all such algorithms have in common is that they have a large number of parameters, requiring an even larger number of examples to be trained. There are two major costs associated with this approach: firstly, the computational cost of training on the datasets, usually using a single high-end graphics card or a cluster of them; and secondly, the difficulty of acquiring large enough datasets to do the training with. Sure, there exist techniques for artificially extending existing datasets into larger ones in order to help against overfitting, but even these cannot handle the case of datasets with only a handful of examples. With all the hype around deep learning it is easy to forget that earlier approaches to object recognition, while much more limited in what they could solve, did not suffer from these difficulties and can sometimes still be the better choice.

If we look back at when image recognition was first considered as a problem to be solved with computers, we see that the problem was at first greatly underestimated. Back in the summer of 1966 a very optimistic project was started at MIT, using only the student summer workers that year and with the aim of solving the computer vision problem. As you can read in the PDF, the final goal was, in hindsight, a quite ambitious one indeed:

“The final goal is OBJECT IDENTIFICATION which will actually name objects by matching them with a vocabulary of known objects”.

Needless to say, this task proved more complex than what was first imagined, and has since led to the creation of a whole field of research. It is not until recently, more than 50 years after that summer project, that we can say that general purpose object recognition is a more or less solved or solvable problem.

In my previous image processing post we looked at a simple image processing task, reading the time from an analog clock, and showed how this could be solved using the image processing tools available in Sympathy for Data, all without having to write a single line of code. A major factor in this solution was limiting ourselves to images acquired in a very specific way. This solution generalizes more to industrial image processing, such as reading a pressure valve, than to general purpose tasks like reading a random clock you find on the side of a building.

In this and the upcoming image processing post I will show how we can use the image processing tools and the machine learning tools of Sympathy to similarly solve an object recognition task under well-defined circumstances. These circumstances again generalize more to an industrial setting, such as analysing objects on a conveyor belt, where we can have a clearly defined environment and camera setup.

For this purpose we will have a camera mounted straight above the incoming objects. The objects are photographed against a neutral background (white) clearly distinguishable from the objects themselves (metallic grey). Furthermore we ensure that the lighting is smooth and even over the whole area and that no sharp shadows are cast by the objects themselves or anything else. In the example dataset used here we use pictures of a mix of fasteners, with the target of identifying the screws. We also ensure that the objects are not overlapping, since it would require more advanced techniques to separate overlapping objects, a problem almost as hard as object recognition itself. If we would like to do this in an industrial setting we could use a mechanical solution to ensure this before the objects enter the belt, e.g. using a suitable hopper.

#### Segmenting the image

We will start by solving the problem of segmenting and labelling an input image, with the task of deciding which areas of the image correspond to different objects. The intention here is to pick out individual objects and to classify each found object according to whether it matches the target object.

Thus our workflow will contain the following steps:

1. Separate the image into pixels that belong to objects or to the background
2. Clean up this image to remove noise and to completely close all objects
3. Create labels for each pixel
4. Extract a list of binary image masks, one per found label.

A typical step in many image segmentation tasks is to use a simple thresholding algorithm. We can use simple thresholding and the fact that the metallic grey objects all are darker than the background paper in order to create a binary representation of the pixels that belong to objects. We start by attempting to use a simple basic threshold at the value 0.5.

Note that we added a filtering step that inverts the image by scaling it by a factor of -1 and adding an offset 1 to it before we do the thresholding. Thus we can ensure that a completely dark pixel (value 0) becomes 1.0 before thresholding and is classified as a “true” boolean after the thresholding.
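Outside of Sympathy, the inversion and the fixed threshold can be sketched in a few lines of NumPy. This is only an illustration of the idea, not the Sympathy nodes themselves: the function name `invert_and_threshold` is made up for this example, and we assume a greyscale image with values in the range 0 to 1 and a strict "greater than" comparison.

```python
import numpy as np

def invert_and_threshold(image, threshold=0.5):
    """Invert a greyscale image with values in [0, 1] and apply a fixed threshold.

    The strict '>' comparison is an assumption made for this sketch.
    """
    inverted = -1.0 * image + 1.0  # scale by -1 and add an offset of 1
    return inverted > threshold    # True for dark (object) pixels

# Tiny hypothetical image: dark pixels on the left, bright pixels on the right.
image = np.array([[0.0, 0.9],
                  [0.3, 1.0]])
mask = invert_and_threshold(image)
print(mask)
```

The completely dark pixel (value 0) becomes 1.0 after the inversion and ends up as True in the mask.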

We can also note that the result of the basic thresholding is quite poor. We incorrectly classify the bottom half of the image as belonging to an object. If we raise the threshold until no background is classified as an object, then we instead start losing pixels from the objects, which are then classified as background. You can see this effect in the images below, where we have a higher threshold on the right side than on the left side.

Furthermore, just using a simple scalar value as a hard-coded threshold will not work very well if there is even the slightest change in global illumination from picture to picture.

We can instead use one of the automatic thresholding algorithms that find a scalar suitable for thresholding by themselves. The simplest automatic threshold is the mean or the median of the pixel values; the median, for instance, sets the threshold such that half the image will be True and half the image False. This is however seldom good, and most definitely not good for our application, since the background makes up more than 50% of the image and we are therefore almost guaranteed that part of it is classified as belonging to the objects.
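A small synthetic example shows why the median is a poor choice here. The pixel values below are made up for illustration (already inverted, so objects are bright), with 80% background and 20% object pixels.

```python
import numpy as np

def median_threshold(inverted):
    """Threshold at the median, so that about half of the pixels become True."""
    return inverted > np.median(inverted)

# Hypothetical, already inverted pixel values: 80 background pixels with low
# values followed by 20 object pixels with high values.
background = np.linspace(0.05, 0.15, 80)
objects = np.full(20, 0.8)
pixels = np.concatenate([background, objects])

mask = median_threshold(pixels)
false_positives = int(mask[:80].sum())  # background pixels classified as object
print(int(mask.sum()), false_positives)
```

Half of the 100 pixels are classified as object, even though only 20 of them really are, so the remaining 30 are background pixels that were misclassified.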

Other alternatives for automatic thresholding include a number of algorithms that consider the overall distribution of pixel values and try to find a suitable threshold. For example, the Otsu algorithm assumes that the pixel values follow a bi-modal distribution and finds a global threshold that minimises the variance within each found class.
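To make the idea concrete, here is a minimal NumPy sketch of Otsu's method. It is not the implementation used by Sympathy; it assumes pixel values in the range 0 to 1 and a 256-bin histogram, and it maximises the between-class variance, which is equivalent to minimising the within-class variance since the two always add up to the total variance.

```python
import numpy as np

def otsu_threshold(values, bins=256):
    """Global Otsu threshold for values in [0, 1] (sketch, histogram based)."""
    hist, edges = np.histogram(values, bins=bins, range=(0.0, 1.0))
    hist = hist.astype(float)
    centers = (edges[:-1] + edges[1:]) / 2.0

    # For every possible split after bin i: pixel counts and intensity sums
    # of the "low" class (bins 0..i) and the "high" class (bins i+1..end).
    count_low = np.cumsum(hist)[:-1]
    sum_low = np.cumsum(hist * centers)[:-1]
    count_high = hist.sum() - count_low
    sum_high = (hist * centers).sum() - sum_low

    mean_low = sum_low / np.maximum(count_low, 1.0)
    mean_high = sum_high / np.maximum(count_high, 1.0)

    # Between-class variance (up to a constant factor); maximising it is
    # the same as minimising the within-class variance.
    between = count_low * count_high * (mean_low - mean_high) ** 2
    best = int(np.argmax(between))
    return edges[best + 1]  # the bin edge separating the two classes

# Hypothetical bi-modal data: 800 background and 200 object pixels.
values = np.concatenate([np.linspace(0.05, 0.24, 800),
                         np.linspace(0.70, 0.90, 200)])
threshold = otsu_threshold(values)
mask = values > threshold
```

On this well-separated synthetic data the threshold lands in the gap between the two groups, so exactly the 200 object pixels end up in the mask. Real images are rarely this clean, which is why the result on our fastener images is good but not perfect.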

The results of Otsu are surprisingly good for most images, as you can see in the image above. However, we note that this algorithm still misses some parts of the objects (see the upper edge of the circular washers in the image above). Sometimes it is impossible to get a good enough result by just setting a single global threshold value.

Other alternatives exist that perform adaptive thresholding: they consider a window around each pixel and calculate a threshold value for that pixel based on this window. With this technique we can, for instance, easily compensate for any unevenness in the overall lighting.

One example of this is an adaptive Gaussian thresholding method. Here we first perform a low-pass filtering with a Gaussian kernel of size 21 and sigma 11. We take the low-pass filtered value and apply an offset (-0.01) before testing if it is higher or lower than the pixel that is being thresholded. We picked the value for the kernel size based on the overall size of the objects (the circular ones are approximately 20 pixels wide). The offset compensates for small irregularities in the background itself.
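The same idea can be sketched with plain NumPy. The kernel size, sigma and offset are the values from the text; everything else is an assumption made for this example: the helper names are made up, the border is handled by repeating the edge pixels, and we work on the raw (non-inverted) image, so a pixel counts as object when it is darker than its local average plus the offset.

```python
import numpy as np

def gaussian_kernel(size=21, sigma=11.0):
    """1-D Gaussian kernel, normalised to sum to 1."""
    x = np.arange(size) - (size - 1) / 2.0
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()

def gaussian_blur(image, size=21, sigma=11.0):
    """Low-pass filter: separable Gaussian, rows first and then columns."""
    kernel = gaussian_kernel(size, sigma)
    padded = np.pad(image, size // 2, mode="edge")  # repeat border pixels
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def adaptive_threshold(image, size=21, sigma=11.0, offset=-0.01):
    """True where a pixel is darker than its local average plus the offset."""
    local_mean = gaussian_blur(image, size, sigma)
    return image < local_mean + offset

# Hypothetical test image: slightly uneven lighting from left to right and
# one dark 6x6 object in the middle.
image = np.tile(np.linspace(0.7, 0.8, 60), (60, 1))
image[27:33, 27:33] -= 0.4
mask = adaptive_threshold(image)
```

Because each pixel is only compared with its own neighbourhood, the slow change in lighting across the image does not matter, and the small negative offset keeps near-uniform background from being picked up.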

The noise on the background can be removed in a later stage using morphological opening. Before we progress to this, however, we consider one more approach, which is to instead extract all the edges in the image and to perform morphological operations to close the objects based on the edge data. We do this by applying a Canny edge detector to the raw input image (no pre-scaling step needed anymore). As we can see below, this method generates no false positives and does capture all sides of the objects.

The interior of the objects can be filled in using morphological closing after the Canny edge detector. This performs a dilation operation followed by an erosion operation, where the dilation makes all objects “thicker” by a given radius and the erosion makes them correspondingly “thinner”. Each of these operations is done by checking a neighbourhood around each pixel and taking the MAX or MIN value in the neighbourhood, respectively.
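As a rough sketch of what "MAX or MIN over a neighbourhood" means, the two operations can be written directly in NumPy. The square neighbourhood and the handling of the image border are assumptions made for this example; the Sympathy nodes may use a different structuring element.

```python
import numpy as np

def _neighbourhood_stack(mask, radius, pad_value):
    """Stack all shifted copies of the mask within a square neighbourhood."""
    size = 2 * radius + 1
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=pad_value)
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(size) for dx in range(size)])

def dilate(mask, radius=1):
    """MAX over the neighbourhood: objects grow by `radius` pixels."""
    return _neighbourhood_stack(mask, radius, False).max(axis=0)

def erode(mask, radius=1):
    """MIN over the neighbourhood: objects shrink by `radius` pixels."""
    return _neighbourhood_stack(mask, radius, True).min(axis=0)

# A single white pixel grows into a 3x3 block and shrinks back again.
mask = np.zeros((5, 5), dtype=bool)
mask[2, 2] = True
thick = dilate(mask, radius=1)
thin_again = erode(thick, radius=1)
```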

Consider the image on the left side below. If we perform dilation on this image, we get a white pixel in the areas marked red and green, and only the area marked in blue gets a black pixel. If we instead perform erosion, we get black pixels in the red and blue areas and only the green area stays white. On the right side of the example below we can see the result of performing the erosion operation followed by a dilation operation. It has first made the white objects significantly thinner, and then thicker again.

For many objects, making them thicker followed by thinner would not change the overall shape of the object. However, if two edges both become thick enough to touch each other, then there are no black areas left in the middle from which they can be made thinner again. Thus the end result is that the objects have been closed, as can be seen in the images below:

One problem that we can spot with the morphologically closed image is that some objects are now touching each other. The thickening radius was larger than the distance between these objects, which has created small bridges between them. To compensate for this we can perform a morphological opening that removes the small bridges between the objects. This step also removes all the small dots of false positives given by the thresholding algorithm, if that is used instead of the edge detection.
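Closing and opening are just the two basic operations applied in different orders. The sketch below uses the same assumptions as before (square neighbourhood, made-up helper names, simple border handling) and two small synthetic masks: a hollow outline that gets filled by closing, and two blobs joined by a thin bridge that is removed by opening.

```python
import numpy as np

def _morph(mask, radius, reduce, pad_value):
    """Apply `reduce` (np.max or np.min) over a square neighbourhood."""
    size = 2 * radius + 1
    h, w = mask.shape
    padded = np.pad(mask, radius, mode="constant", constant_values=pad_value)
    shifted = [padded[dy:dy + h, dx:dx + w]
               for dy in range(size) for dx in range(size)]
    return reduce(np.stack(shifted), axis=0)

def closing(mask, radius):
    """Dilation followed by erosion: fills holes and small gaps."""
    dilated = _morph(mask, radius, np.max, False)
    return _morph(dilated, radius, np.min, True)

def opening(mask, radius):
    """Erosion followed by dilation: removes thin bridges and small dots."""
    eroded = _morph(mask, radius, np.min, True)
    return _morph(eroded, radius, np.max, False)

# Closing: the outline of a 5x5 square (as an edge detector might give us).
outline = np.zeros((11, 11), dtype=bool)
outline[3:8, 3:8] = True
outline[4:7, 4:7] = False  # hollow interior
closed = closing(outline, radius=2)

# Opening: two 5x5 blobs connected by a one pixel wide bridge.
blobs = np.zeros((9, 15), dtype=bool)
blobs[2:7, 1:6] = True
blobs[2:7, 9:14] = True
bridged = blobs.copy()
bridged[4, 6:9] = True
opened = opening(bridged, radius=1)
```

After the closing the square is completely filled, and after the opening the two blobs are back to their original shape with the bridge removed.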

The final step before we can start working with the objects is to use labeling to create a unique ID for each object. The labeling algorithm takes a binary image as input and creates an image with an integer for each pixel. The integer value of a pixel is unique to the object that the pixel belongs to. If there were even a single pixel linking two objects to each other, then both objects would be assigned the same integer value. We can visualise the result of this step by clicking on the object; this gives a pseudo-colour for each object based on a default colour map.
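For the curious, labeling can be sketched as a simple flood fill. This is a plain Python illustration, not the algorithm used by Sympathy, and treating diagonal neighbours as connected (8-connectivity) is an assumption made here.

```python
from collections import deque

import numpy as np

def label(mask):
    """Give every connected group of True pixels its own integer ID (from 1)."""
    h, w = mask.shape
    labels = np.zeros((h, w), dtype=int)
    current = 0
    for y in range(h):
        for x in range(w):
            if not mask[y, x] or labels[y, x] != 0:
                continue
            current += 1  # found a new, not yet labeled object
            labels[y, x] = current
            queue = deque([(y, x)])
            while queue:  # flood fill over all connected pixels
                cy, cx = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny, nx] and labels[ny, nx] == 0):
                            labels[ny, nx] = current
                            queue.append((ny, nx))
    return labels, current

# Two separate blobs get two IDs ...
blobs = np.zeros((9, 15), dtype=bool)
blobs[2:7, 1:6] = True
blobs[2:7, 9:14] = True
labels, count = label(blobs)

# ... but a single linking pixel row merges them into one object.
bridged = blobs.copy()
bridged[4, 6:9] = True
bridged_labels, bridged_count = label(bridged)
```

This also shows why the opening step matters: with the bridge in place both blobs share one ID.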

Note that since objects that are close to each other have similar IDs, they are mapped to almost the same colour. The assigned ID values differ even when this is not evident in the image below:

One final useful node creates a list of all the found objects. The node Image to List can be used to convert the labeled image into a list of images. Use the configure menu to select “from labels” to do this conversion.

As we can see in the preview window below, we have a list that contains many images. Each entry in the list is an image mask that is true only for one single object (as defined by the unique IDs given by the labeling operation). We will use these images as the inputs to our classification algorithm to detect the individual objects.
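Conceptually, this conversion is just one comparison per label. A minimal NumPy sketch, assuming a label image where 0 means background (the function name is made up for this example):

```python
import numpy as np

def masks_from_labels(labels):
    """Return one boolean mask per object ID, skipping the background (0)."""
    return [labels == object_id
            for object_id in np.unique(labels) if object_id != 0]

# Hypothetical label image with two objects.
labels = np.array([[1, 1, 0],
                   [0, 0, 2],
                   [0, 2, 2]])
masks = masks_from_labels(labels)
print(len(masks), [int(m.sum()) for m in masks])
```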

#### Summary

In this post we have looked at the segmentation problem and shown how simple thresholding or edge detection algorithms can be used together with morphological operations and labeling to create a list of objects in an input image. This list consists of masks, each singling out one individual object in the image. In part 2 we will continue with the classification of each found object.

Märta, why did you choose engineering?
I have always been interested in technology and wanted to know how stuff works. I also liked math and physics and thought they were kind of easy. In gymnasium I first planned to study natural science but ended up choosing more technology-oriented classes, since the combination of math and reality was tempting. I think that might also have been a reason why I focused so much on control theory.

What was the best part of your engineering studies?
Without a doubt my time spent as an exchange student at University of California Santa Cruz!

That sounds like a great experience!
Yes, it was fun to take other courses than what was available at Lund University. I also got the opportunity to work in the Autonomous Systems Lab, playing around with robots and drones. This was very valuable since it was like a mix of working, studying and doing research. California is also such a great place so besides studying I spent a lot of time surfing and skateboarding.

Autonomous drones sound like the optimal way to apply math in reality. How did you move on from that?
Well, after California I returned to Sweden in time for my master thesis. Since I had spent quite some time working with autonomous systems and drones I wanted to do my thesis in that area. With that said, I was thrilled when the perfect project was available at SAAB.

It was about controlling a swarm of autonomous flying drones. Having multiple drones in a swarm leads to many interesting problems ranging from internal distance estimation between the drones to the high-level behavior of the swarm.

So now you work as an engineer, is it all you thought it would be?
Well, I never really had any clear picture of exactly what it means to “be an engineer”. It wasn’t until the final years at the university I started to get a better picture of what it means. But yes, I work with applied mathematics every day so in that sense it is what I envisioned.

What does a typical day at work look like?
I work in an agile environment, kind of scrum-ish… The day starts with a daily scrum meeting where we go through what we are working on and potential issues. After the meeting it’s time to start work on my current tasks. Right now, my main focus is on PLC programming, coding new features and testing them out at the machine or in a virtual environment. Some time is also spent on developing the virtual test rig, bug fixes etc. My days are very flexible, and I control a lot of the time myself, which suits me perfectly.

Why did you choose Combine?
I started my career at a larger consulting company. I liked the role of a consultant, but I felt that I wanted to work for a company more focused on the technologies I’m interested in. I had also heard good things about Combine from friends.

Do you also want to work with applied mathematics and control systems development as a consultant at Combine? See if we have any available positions, or just give us a call and see if we have something coming up soon.

For today’s post we have a guest author, Lia Silva, who works in Data Science and has her own blog, https://statsletters.com/, on mathematics, statistics and other fun stuff. Without any further presentation, see what Lia has to write about studying our relationship with consumerism through graph theory and statistics:

One important difference since a decade ago is what can be measured about how a user consumes a product. Lately, it seems like every little thing that we do can be used to build a projection of ourselves from our habits. And in the end, that projection can be used to poke the reptilian parts of your reward circuitry so they release the right cocktail of hormones. A cocktail that makes you choose bright red over dull gray, reach for your wallet or click “I accept the terms and conditions”.

Through the years, recipes for such cocktails have been perfected by different disciplines. As an educated consumer, actively tasting those recipes in modern products can be as interesting as wine tasting, minus the inebriation. This is the first of a series of posts intended to help you be more aware of your own reward circuitry by using interpretations that different algorithms build from observing your measurable actions.

Another intention with this series of blog posts is to show that the methods are not necessarily:

• Absolute
• Inherently objective
• Infallible

And definitely NOT suitable to use blindly e.g. “press the Analytics Button and have the neural network tell me everything”*. If anyone promises that without disclosing any assumptions, make sure to ask LOTS of questions.

The “Serpent people” series will present some textbook representations suitable for modeling this problem, discuss the aspects that each of them reflects best, and try out different open-source libraries on different artificially generated models of “people according to what we know about them”.

Some of that material is already in a very fluid shape in this notebook if you can’t wait to play by yourself :). The representation in there is simply what I considered to be natural for the problem itself. I plan on elaborating that representation with classics such as Frequent Itemset Mining and Associative Classification. For those, you can start by checking out Chapter 10 of “Data Mining” by Mehmed Kantardzic.

And that’s the teaser for what will come. For now, I will leave you with this David Bowie Song. Granted, it’s “Cat People” instead of “Serpent People”, but pretty cool still.

## The basic idea

The basic idea we have for reading the time (or the value of any other analog device!) is to first capture an image of the device using a web camera and then use the image processing tools in Sympathy for Data to read the hour and minute hands from the clock. Our goal is to do this with only the open-source image processing tools that are included in Sympathy for Data, and no custom code.

Our aim is not to create a solution that works for any image of any clock, but rather to show how we can reliably read the values of the given clock given a very specific camera angle and lighting situation. To do this for any clock and lighting setup requires more advanced techniques, often involving machine learning, for which it is harder to guarantee that they work in all of the target situations.

For the purpose of this blog post, we will not be using machine learning or any other advanced algorithms but rather rely on basic (traditional) image processing techniques. This problem very much resembles that of designing the image processing algorithms that are used every day in industrial production, where you have a highly controlled environment and want simple and fast algorithms. By controlling the environment we can make this otherwise complex task very simple and easy to design an algorithm for. When doing image processing for industrial purposes it is commonly found that you can simplify the problem and increase robustness by heavily controlling the environment and the situation in which your images are acquired.

For the impatient, you can download the dataset with images that we prepare in the first part as well as the finished Sympathy for Data workflow.

## Acquiring the data

Step one is to acquire some images of the clock and to try to analyse them in Sympathy. If you don’t want to repeat these steps yourself you can simply download the full dataset from here.

We start with a naive approach and just place the web camera in front of the clock and record images once per minute over a full day. Some examples of these images are shown below:

What we can see in the images above are two main problems: the lighting varies widely depending on the time of day, and the lighting within the image itself varies widely due to specular reflections, which makes part of the hour hand invisible in the second image above. Whatever algorithm we come up with will have a hard time dealing with an invisible hour hand.

We can also note that with a stronger, directional light we would have shadows cast by the hour and minute arms of the clock, projected onto different parts of the clock face depending on the direction of the incoming light. If we were to directly try to analyse these images we would need to compensate for the shifting position of the shadows, and we could not use any simple thresholding steps since the overall light changes widely. Any algorithm we create for these kinds of images will inherently be more complex and possibly more prone to failures when the light conditions change. We would need to test it over a wide range of conditions (day, night, sunset, rainy weather, sunny weather, summer, winter) to be sure that it works correctly under all of them.

An attempted first fix for the specular reflections was to remove the cover glass of the clock, the idea being that the face paint of the clock wouldn’t be reflective enough to give these issues. This, however, wasn’t enough to solve the problem with disappearing arms for all lighting conditions, and were we to apply this idea in an industrial setting it may often not be possible to make such a modification. A better solution is therefore to remove all direct light in favour of a setup with only diffuse lights.

To solve both of these problems we choose to create a controlled environment with only diffuse light, where we know that the only thing that changes is the position of the clock’s arms. We force a constant light level by enclosing the clock and camera in a box with a light source. This way we can also eliminate the shadows of the clock’s arms by placing the light source in the same direction as the camera.

## Building a camera friendly light source

In order to eliminate the shadows from the clock arms and to provide even and diffuse lighting we place a ring of white LEDs around the camera. Usually this is done using professional solutions such as a ring light for photography, but we’ll manage with a simple 3D-printed design and some hobby electronics. You can find the downloads for these over at Thingiverse.

The design for this light is a simple ring that holds the LEDs, plus a diffuser on top of it to avoid any sharp reflections.

After printing the parts above we place a number of white LEDs in the small holes in the middle part above. By twisting the pins together on the underside (take care with the orientation of anode and cathode!) we can easily keep them in place while at the same time connecting it all up. See if you can spot the mistake I made in the wiring below. Fortunately it was salvageable.

The next step is to place the diffuser over the LEDs and to attach it to your web camera. Power it with approximately 3V per LED used. Point it straight at the clock and put an enclosure over it. We can use a simple cardboard box as an enclosure that removes all external light.

Congratulations, you can now get images of the clock with perfect lighting conditions regardless of sunlight, people walking by, or any other factors that would complicate the readings.

## Pre-processing the images

When doing image processing it is common to operate on greyscale images unless the colour information is an important part of the recognition task. We therefore first run a pre-processing step on the whole dataset where we convert the images to greyscale and downsample them, since we don’t need the full resolution of the camera for the rest of the calculations. This can easily be done using Sympathy for Data.

First create a new flow and point a Datasources directory to a copy of the dataset (we will overwrite the files in place). Add a lambda node by right-clicking anywhere in the flow and selecting it. Connect a map node to apply the lambda for every datasource found.

Before you can run it, add the nodes below into the lambda node to do the actual image conversion. You need to select greyscale in the configuration menu of the colour space conversion node, and rescale X/Y by 0.5 in the transform image node. Note that the “save image” node here overwrites the images in place, so try not to run the node until everything is ok. Another option would be to compute a new filename to use instead, using e.g. a datasource to tables node and a calculator node.
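For readers who prefer to see the two operations spelled out, here is a minimal sketch in plain Python of what the conversion and the rescaling by 0.5 amount to. It is not the code behind the Sympathy nodes: the function names, the luminance weights (0.299, 0.587, 0.114) and the 2×2 block averaging are assumptions chosen for the example, and images are represented as nested lists with values in the range 0 to 1.

```python
def to_greyscale(rgb_image):
    """Convert rows of (r, g, b) tuples in [0, 1] to rows of grey values.

    The weights are the common Rec. 601 luminance weights (an assumption;
    the node may use a different conversion).
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def downsample_by_two(image):
    """Halve width and height by averaging each 2x2 block of pixels."""
    height = len(image) // 2 * 2      # ignore a trailing odd row
    width = len(image[0]) // 2 * 2    # ignore a trailing odd column
    return [[(image[y][x] + image[y][x + 1]
              + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
             for x in range(0, width, 2)]
            for y in range(0, height, 2)]


# Hypothetical 2x2 colour image: one white pixel, three black pixels.
rgb = [[(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)],
       [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]]
small = downsample_by_two(to_greyscale(rgb))
```

Plain lists keep the sketch free of dependencies; for real images an array library would be the natural choice.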

In the dataset that you can download we have already done these conversions (downscaling to 800×500 pixels) to save on bandwidth.

## Analysing the images

Our goal in this section is to create a Sympathy workflow that allows us to take any image of the clock and convert it into an hour and minute representation of the time.

### Creating a template

For the first step we want to extract only the arms of the clock that should be analysed. For this purpose we will use a practical trick to easily detect only the moving parts of the image. We do this by first calculating a template image that shows how the images would look if all the moving parts were removed. This trick only works when the camera is fixed and there are no major changes in overall lighting.

Start by calculating the median (or in our case the max, since we know the arms are black) of a few different images from the dataset. This will give an image where the arms of the clock are removed.

Select max as operator in the Overlay Images node below. This works since we know that the arms are darker than the background, and whichever pixel has a light colour in any of the images will have a light colour in the final image.
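The idea behind the max operator can be sketched in a few lines of plain Python. This is only an illustration of the pixel-wise maximum, not the Overlay Images node itself; the function name and the tiny example images are made up.

```python
def max_template(images):
    """Pixel-wise maximum over a list of equally sized greyscale images."""
    height, width = len(images[0]), len(images[0][0])
    return [[max(img[y][x] for img in images) for x in range(width)]
            for y in range(height)]


# Two hypothetical 1x3 images: the dark "arm" (0.1) sits at a different
# pixel in each, so the maximum recovers the light background everywhere.
first = [[0.9, 0.1, 0.9]]
second = [[0.1, 0.9, 0.9]]
template = max_template([first, second])
```

A pixel only stays dark in the template if an arm covers it in every input image, which is why the arms need to be in different positions.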

The results look surprisingly good given that we only used a few images (where the arms were all in different positions):

We save this template in a separate file so that we don’t have to redo the calculation for each image that should be processed.

### Extracting the minute and hour arms

Continue by creating a new workflow and load one of the images to be analysed. We start by making a subtraction of the image to be analysed from the template image.

This will give an image where only the arms are visible. You will need to select “subtract” as the operation for the “overlay images” node.

The next step is to apply a threshold to pick out only the arms of the clock as a binary image. To do so we use a Threshold node and set it to basic threshold. We can figure out a good threshold level by looking at the histogram above of the image after subtraction. We see that the maximum value of the image is 0.55 and that something significant seems to happen around the 0.4 mark (note that the graph is logarithmic!). We set the threshold to 0.35 and get the results shown above.
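As a rough sketch of what the subtraction and the basic threshold do to the pixel values, consider the following plain Python. The function names are hypothetical, and whether the node treats a value exactly at the level as inside or outside is an assumption (here it is strictly greater than).

```python
def subtract(template, image):
    """Pixel-wise template minus image; dark arms become bright values."""
    return [[t - p for t, p in zip(t_row, p_row)]
            for t_row, p_row in zip(template, image)]


def threshold(image, level=0.35):
    """Binary image: 1 where the pixel is above the level, otherwise 0."""
    return [[1 if p > level else 0 for p in row] for row in image]


# Hypothetical 1x3 example: the middle pixel is covered by a dark arm.
template = [[0.9, 0.9, 0.9]]
image = [[0.9, 0.2, 0.85]]
binary = threshold(subtract(template, image))
```

The small difference in the last pixel (0.05) stays well below the level and is therefore treated as background.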

Since there can be some small smudges and missed spots in the binary image we apply morphological closing to the image using a structuring element of size 20, which should be more than enough to compensate for any missed pixels caused by noise, scratches on the object/lens or the otherwise black areas of the image.

Finally, we can note that we only actually need to see the tips of the hour and minute hands in order to read the time. If we sample every point on a circle with a radius close to the edge of the picture, we know which pixels belong to the minute arm. Similarly, if we sample every point on a smaller circle we get one or two sets of pixels: one set when the hour arm is below the minute arm, or two sets, corresponding to the hour arm and the minute arm, when they are not overlapping.

We can do this sampling by first creating two new templates that we use to select a subset of the pixels. This is done by drawing a white ring on an otherwise empty image for each of the two selections. We can do this using the Draw on image node and a Manually create table node that gives the XY coordinates (416, 256) and radius 200 for a circle with colour 1.0 and 170 for a circle with colour 0.0, corresponding to the ring selecting the minute arm below:

After multiplying these two templates with the thresholded image (again using the overlay images node) we get two new images with blobs corresponding to the tips of the hour and minute hands:

All that is left now is to extract the coordinates of these blobs and to apply some math to convert them into hours and minutes.
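The ring template for the minute arm and the multiplication described above can be sketched as follows. The centre (416, 256), the radii 200 and 170 and the 800×500 image size are taken from the text; the function names are hypothetical, and the radii of the smaller ring for the hour arm are not given, so only the minute ring is shown.

```python
def ring_mask(width, height, centre, r_outer, r_inner):
    """White ring (1.0) on a black image (0.0).

    Mirrors drawing a filled circle of radius r_outer with colour 1.0 and
    then a filled circle of radius r_inner with colour 0.0 on top of it.
    """
    cx, cy = centre
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            d2 = (x - cx) ** 2 + (y - cy) ** 2
            row.append(1.0 if r_inner ** 2 < d2 <= r_outer ** 2 else 0.0)
        mask.append(row)
    return mask


def multiply(image, mask):
    """Pixel-wise product: keeps only the pixels selected by the mask."""
    return [[p * m for p, m in zip(p_row, m_row)]
            for p_row, m_row in zip(image, mask)]


minute_ring = ring_mask(800, 500, (416, 256), 200, 170)
```

Multiplying the thresholded image with `minute_ring` leaves only those arm pixels that are between 170 and 200 pixels from the centre.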

### Computing the minute

We can compute the position of the minute hand by using an Image statistics node with the algorithm “blob, DoG” with a threshold of 0.1. This algorithm finds “blobs”, or light areas on a dark background, in an image by subtracting two low-pass (Gaussian) filtered versions of the image filtered at different scales.

All other parameters can be default, but the default value for threshold of the difference-of-gaussian algorithm is too high for our inputs.
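To give a feeling for how a difference of Gaussians picks out a blob, here is a minimal one-dimensional sketch in plain Python. It is not the implementation used by the Image statistics node, which works on 2D images; the function names, the two scales (sigma 1 and 3) and the zero padding at the borders are assumptions made for the example.

```python
import math


def gaussian_kernel(sigma):
    """Normalised Gaussian weights, truncated at three standard deviations."""
    radius = int(3 * sigma)
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(signal, sigma):
    """Low-pass filter a 1D signal, treating samples outside it as zero."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def dog_response(signal, sigma_small=1.0, sigma_large=3.0):
    """Difference between a fine-scale and a coarse-scale blurred signal."""
    small = blur(signal, sigma_small)
    large = blur(signal, sigma_large)
    return [s - l for s, l in zip(small, large)]


# A light "blob" three samples wide on a dark background.
signal = [0.0] * 21
signal[9:12] = [1.0, 1.0, 1.0]
response = dog_response(signal)
peak = max(range(len(response)), key=lambda i: response[i])
```

The response is largest at the middle of the blob, and a detector would only report the peak if it exceeds the chosen threshold.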

Now all we have to do is to convert the XY values 443, 426 of the tip of the minute hand into an actual value in the range 0 – 60. We can do this by calculating the vector from the centre of the clock, determined from the raw image as (416, 256), to the point of the detected blob. This gives us a vector (187, 10). By taking the arctan of this vector in a calculator node we get the angle to this point and can convert it into minutes. Note that we invert the y-component of this vector to compensate for the difference in coordinate systems (the y-axis in images points down):
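As a minimal sketch of this conversion outside of Sympathy, the calculation could look like the code below. The function name is hypothetical, and it uses `math.atan2` rather than a plain arctan so that all four quadrants are handled; the exact expression used in the calculator node is not reproduced here.

```python
import math


def minute_from_tip(tip, centre=(416, 256)):
    """Convert the image coordinates of the minute-hand tip to minutes.

    Returns a value in the range [0, 60), measured clockwise from the
    12 o'clock position.
    """
    dx = tip[0] - centre[0]
    dy = centre[1] - tip[1]      # invert y: image rows grow downwards
    angle = math.atan2(dx, dy)   # clockwise angle from "straight up"
    return (angle / (2 * math.pi) * 60) % 60


# Hypothetical tips on the four main directions, 180 pixels from the centre.
up = minute_from_tip((416, 76))
right = minute_from_tip((596, 256))
down = minute_from_tip((416, 436))
left = minute_from_tip((236, 256))
```

Swapping the usual argument order of `atan2` is what makes the angle start at 12 o'clock and increase clockwise, as on a clock face.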

### Computing the hours

In order to compute the hour we need to eliminate one of two possible candidates for the hour. Consider the blobs shown below. From just this data it is hard to know which hour it is:

However, what we can do is to take the position of the minute arm that we calculated above and clear out one of the two blobs above. Since we know the radius that we used for the circle multiplied with the data, we can easily draw a black area on top of the location where the minute arm is located and at the given distance from the centre:

For this purpose we use another calculator node to compute the X/Y coordinate above and to draw a black circle onto the image at that location. For clarity it has been drawn as a brown circle in the example above to see what area of the image is deleted.
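The coordinate calculation is the inverse of the minute calculation. A small sketch, with a hypothetical function name and a made-up radius of 100 pixels for the smaller ring (the actual radius is not stated above):

```python
import math


def point_on_ring(minute, radius, centre=(416, 256)):
    """Image coordinates at a given distance from the centre, in the
    direction the minute arm points for the given minute value."""
    angle = minute / 60.0 * 2 * math.pi   # clockwise from 12 o'clock
    x = centre[0] + radius * math.sin(angle)
    y = centre[1] - radius * math.cos(angle)  # image y-axis points down
    return (x, y)


# Where to draw the black circle if the minute arm points at 15 minutes.
erase_at = point_on_ring(15, 100)
```

Drawing a filled black circle at `erase_at` removes the part of the minute arm that crosses the smaller ring, leaving at most the blob of the hour arm.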

We extract the hour value from the remaining blob, if there is one, in the same way as the minute value was calculated. Note that if there is no blob in the image containing the tip of the hour arm then the expression below gives a NaN value. This happens when the hour arm is under the minute arm.

Finally, we can finish the flow by adding a special-case calculation that checks if the ‘hour’ column has the NaN value and, if so, instead derives the hour position from the minute position. In this step we also round the minutes to a whole number and round the hours down to the nearest smaller integer. Note that we subtract a fraction of the minutes from the hours before rounding, since the hour arm moves closer and closer to the next hour as the minutes increase.

We also subtract 1 (modulo 60) from the minute position since the captured images were all slightly rotated clockwise. We could have compensated for this in the original pre-processing if we had noticed it earlier.
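The exact expressions of the calculator node are not shown here, so the following is only a sketch of how such a final step could be written. Note that it is a variation on the description above: instead of rounding down, it subtracts the full minute fraction from the hour position and rounds to the nearest integer. The function name, and the factor 11/60 used when the hands overlap (from solving h + m/60 = m/5 for h), are assumptions of this sketch.

```python
import math


def read_time(minute_pos, hour_pos, minute_offset=1.0):
    """Combine the measured hand positions into (hour, minute).

    minute_pos: minute-hand position in [0, 60).
    hour_pos:   hour-hand position in [0, 12), or NaN if no hour blob was
                found because the hour arm is hidden under the minute arm.
    The hour is returned in the range 0..11, where 0 stands for 12 o'clock.
    """
    # Compensate for the slight clockwise rotation of the images.
    minute = round(minute_pos - minute_offset) % 60
    if math.isnan(hour_pos):
        # Hands overlap: both point in the same direction, so the hour
        # follows from the minute alone.
        hour = round(minute * 11 / 60) % 12
    else:
        # Remove the part of the hour-hand position caused by the minutes.
        hour = round(hour_pos - minute / 60) % 12
    return hour, minute
```

For example, `read_time(31.0, 2.5)` gives `(2, 30)`, and `read_time(1.0, float("nan"))` gives `(0, 0)`, i.e. 12:00.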

## Time to check the results

Before we are happy with the flow, let’s check how well it performs versus the ground truth. Since the timestamp was saved when each image was captured we can easily compare these values with the results of the flow. Due to the sampling process, and since the seconds of this clock weren’t synchronized with the seconds of the computer sampling them, we should expect to be off by one minute in some of the readings.
As we can see in the table below we successfully read the time, with a difference of at most one minute, for the first 100 images.

By only allowing ourselves to use the first 50 images when we developed the flow, and then validating it by running it on a larger dataset, we gain confidence that the algorithm works for all the situations it will encounter. We have run it on the full dataset of 700+ hours without any other errors.
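A comparison like this needs to handle the wrap-around of the clock face, since 11:59 and 12:00 are only one minute apart. A minimal sketch of such a check, with hypothetical function names and made-up example readings:

```python
def minutes_apart(read, truth):
    """Smallest difference in minutes between two (hour, minute) pairs
    on a 12-hour dial."""
    a = (read[0] % 12) * 60 + read[1]
    b = (truth[0] % 12) * 60 + truth[1]
    diff = abs(a - b) % 720
    return min(diff, 720 - diff)


def all_within_tolerance(readings, truths, tolerance=1):
    """True if every reading is at most `tolerance` minutes off."""
    return all(minutes_apart(r, t) <= tolerance
               for r, t in zip(readings, truths))


# Made-up readings and timestamps, only to show the call.
readings = [(11, 59), (3, 15), (7, 30)]
truths = [(0, 0), (15, 16), (7, 30)]
ok = all_within_tolerance(readings, truths)
```

Timestamps in 24-hour format are reduced modulo 12 before comparing, since the clock face cannot distinguish morning from afternoon.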

Sympathy for Data is a visual software tool that helps to link scripts and data between different systems and enables analysis.

The tool targets all domains where data analysis is carried out offline and is challenging and repetitive. Our focus is on automating the tasks of importing, preparing, analyzing and reporting data. In some respects, it is also a visual programming environment.

Sympathy for Data is used by a number of leading companies in different industries, such as the automotive, process and automation industries. At Combine we believe that automation is the key to long-term success, yet a large proportion of current analysis is done manually. This is no longer sustainable as data continues to grow in every respect. Sympathy for Data supports subscription services to databases, network disks, etc. It can find, filter, sort and analyze data as well as create automatically formatted reports.

Applications
Among other things the tool allows you to:
– Transfer data from and to all sources (DB, web & remote, local).
– Move your favourite scripts to nodes and turn them into an analytical flow.
– Easily share data processes across an entire organization.
– Run data processes in batch mode to enable scheduled reporting.
– Easily share data from a data process with web reports or other chosen formats.

One of the aims of Sympathy for Data is to allow users to manage data from many different sources and in different formats. You can connect the tool to databases, select files from different network devices, or analyze thousands of Excel files automatically. We have developed the tool to manage large amounts of data along with other forms of useful data, such as metadata (units, dates, etc.), results and other custom data needed for analysis.

A tool for the entire organization
Sympathy for Data can easily be used throughout an organization, thanks to the flexible configuration options. The tool can be deployed as a standalone application, integrated in a server environment or combined with other enterprise solutions such as SQL (Server Integration Services), SharePoint, etc. The purpose is not to lock users into the platform, but instead to serve as an interface between different systems, data formats, etc.

### 1. Grabbing that cup of coffee

In the figure below, a two-link robot arm is depicted. Say that it depicts your arm. The upper arm has the length $$r_0$$ and the forearm the length $$r_1$$. The position $$(x, y)$$ of your wrist is described by the two angles $$q_0$$ and $$q_1$$. Your wrist is currently located at the red dot. Say we want to grab a cup of coffee that is located at the blue dot. Pretend that you are a robot being controlled by a computer giving commands to your upper arm and forearm. What should the angles $$q_0$$ and $$q_1$$ be to achieve this? This type of problem is usually called the inverse kinematics problem. First, we note that there should be two valid solutions, one with the “elbow” up and one with it down, right? However, assuming this is us grabbing a cup of coffee, our arm is subject to kinematic constraints, meaning that the elbow can only be down. I skip the equations for this, but they can be found in, for instance, Robotics by Bruno Siciliano et al. The solution to this problem is analytical, but not completely trivial, I would say.

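For the curious, the analytical solution can be sketched in a few lines of Python. This is a textbook-style sketch and not taken from the figure: it assumes that the shoulder is at the origin, that $$q_0$$ is measured from the x-axis and that $$q_1$$ is the elbow angle relative to the upper arm. With these conventions a positive $$q_1$$ puts the elbow below the line from shoulder to wrist when the target is to the right. The function names are made up.

```python
import math


def forward(q0, q1, r0, r1):
    """Wrist position for shoulder angle q0 and relative elbow angle q1."""
    x = r0 * math.cos(q0) + r1 * math.cos(q0 + q1)
    y = r0 * math.sin(q0) + r1 * math.sin(q0 + q1)
    return x, y


def inverse(x, y, r0, r1, elbow_down=True):
    """Joint angles (q0, q1) that place the wrist at (x, y)."""
    # Law of cosines gives the cosine of the elbow angle.
    c1 = (x * x + y * y - r0 * r0 - r1 * r1) / (2.0 * r0 * r1)
    if abs(c1) > 1.0:
        raise ValueError("target out of reach")
    q1 = math.acos(c1)
    if not elbow_down:
        q1 = -q1   # the mirrored, elbow-up solution
    q0 = math.atan2(y, x) - math.atan2(r1 * math.sin(q1),
                                       r0 + r1 * math.cos(q1))
    return q0, q1


# Hypothetical arm with two links of length 1 and a cup at (1.5, 0.5).
q0, q1 = inverse(1.5, 0.5, 1.0, 1.0)
wrist = forward(q0, q1, 1.0, 1.0)
```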
Let’s just say we found the angles $$q_0$$ and $$q_1$$ and, full of confidence, are about to grab our cup of coffee. Preferably we should move along a straight line, because we have a computer screen in front of us. However, these angles say nothing about how we reach our cup of coffee. If we first set $$q_1$$ to our calculated target value, and then $$q_0$$, the end point of the robot arm might move along the dashed line or something similar. We would knock down our screen and maybe even that cup of coffee in the process, which is anything but a good start to the day.

So, we don’t just want to move to a certain point of interest, we usually want to reach that point while being constrained to a specific trajectory. A common way to do this in the digital world is to solve the inverse kinematics in very small steps, where all the small steps add up to a straight line. This implies that the angular velocities $$\dot{q}_0$$ and $$\dot{q}_1$$ should be coordinated so we get a smooth movement. A challenge here is that the dynamics of your arm are highly non-linear. Even if your arm were actuated by ideal electrical motors, the equations of motion could well fill up a whole page. In this case your arm is actuated by various muscles, which adds further complexity.

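The small-steps idea can be sketched as follows, ignoring the dynamics completely. The code assumes the same conventions as a textbook two-link arm (shoulder at the origin, elbow angle relative to the upper arm); the function names, link lengths and start and goal points are made up for the example.

```python
import math


def forward(q0, q1, r0, r1):
    """Wrist position for shoulder angle q0 and relative elbow angle q1."""
    return (r0 * math.cos(q0) + r1 * math.cos(q0 + q1),
            r0 * math.sin(q0) + r1 * math.sin(q0 + q1))


def inverse(x, y, r0, r1):
    """One analytical solution (positive elbow angle) for the two-link arm."""
    c1 = (x * x + y * y - r0 * r0 - r1 * r1) / (2.0 * r0 * r1)
    if abs(c1) > 1.0:
        raise ValueError("target out of reach")
    q1 = math.acos(c1)
    q0 = math.atan2(y, x) - math.atan2(r1 * math.sin(q1),
                                       r0 + r1 * math.cos(q1))
    return q0, q1


def straight_line_joint_path(start, goal, r0, r1, steps=50):
    """Joint angles for evenly spaced points on the line from start to goal."""
    path = []
    for k in range(steps + 1):
        s = k / steps
        x = start[0] + s * (goal[0] - start[0])
        y = start[1] + s * (goal[1] - start[1])
        path.append(inverse(x, y, r0, r1))
    return path


# Wrist at (1.0, 0.5), cup at (0.5, 1.2), both links of length 1.
path = straight_line_joint_path((1.0, 0.5), (0.5, 1.2), 1.0, 1.0)
```

Both angles change a little in every step, which is the coordination of the two joints mentioned above.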
Then, we probably want the Cartesian velocity $$(\dot{x}, \dot{y})^T$$ along the straight line to fulfill some criterion. Probably we want to move a bit faster at first and slower when we reach the cup so we can slow down in time and not spill coffee all over the place. This implies that we want to employ some trajectory planning. In the end we have a trajectory involving the joint space variables $$q_0, q_1, \dot{q}_0, \dot{q}_1$$ as well as the Cartesian space variables $$x, y, \dot{x}, \dot{y}$$. So this starts to sound like at least a bit of a hairy problem, eh? Then add other challenges, such as that maybe we cannot even reach the blue dot because our arm is too short. Or that there is a bowl of cereal, a computer screen or some other obstacle in the way that we need to circumvent.

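One simple way to shape the velocity along the line is a cubic time scaling, sketched below. It is only one of many possible choices and is symmetric: the wrist starts at rest, speeds up, and slows down again to arrive at rest. The function names and the example numbers are made up.

```python
def progress(t, duration):
    """Fraction of the line covered at time t, going smoothly from 0 to 1."""
    tau = min(max(t / duration, 0.0), 1.0)
    return 3 * tau ** 2 - 2 * tau ** 3


def cartesian_velocity(t, duration, start, goal):
    """Velocity (x_dot, y_dot) along the straight line at time t."""
    tau = min(max(t / duration, 0.0), 1.0)
    rate = 6 * tau * (1 - tau) / duration   # time derivative of progress
    return (rate * (goal[0] - start[0]), rate * (goal[1] - start[1]))


# A two second reach from (0, 0) to (2, 0): the speed peaks halfway
# and is zero at both ends.
v_start = cartesian_velocity(0.0, 2.0, (0.0, 0.0), (2.0, 0.0))
v_mid = cartesian_velocity(1.0, 2.0, (0.0, 0.0), (2.0, 0.0))
v_end = cartesian_velocity(2.0, 2.0, (0.0, 0.0), (2.0, 0.0))
```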
Then let’s say we add another link to your arm. Now we have three joint variables $$q_0, q_1, q_2$$. Having three input variables and two output variables $$(x, y)$$ we can suddenly have infinitely many solutions to the inverse kinematics problem. This is called a kinematically redundant manipulator. Of course, your arm probably doesn’t have three links (would be quite cool though?). But your arm has far more degrees of freedom than depicted here in 2D. You can tilt your forearm, upper arm and what not. Each and every one of your fingers is even more complex than the planar manipulator depicted. For this 2D problem, your arm is kinematically redundant and we cannot find a closed-form solution to the inverse kinematics problem. We have to select one of the infinitely many solutions according to some criterion. A way to do this is to formulate some form of optimization problem. Machine learning is also sometimes employed.

Figure 1: A figure used to derive some basic ideas related to inverse kinematics

### 2. Kinematics and evolution

Are you convinced now that grabbing a cup of coffee is maybe not as trivial as you might have thought? Back to the philosophical question: what makes us human? For one, we are an animal with relatively high intelligence compared to the other creatures we know of in this world. Without delving into the subjective definition of what intelligence is, I think we can all agree on this. This is what enabled us to develop tools helping us to gather, store and process food, among other things. It encompasses everything from weapons such as spears and bows to agriculture, taming fire and so on. But what would these concepts be without our ability to physically manipulate the world? Agriculture and fire are merely theoretical pipe dreams when lacking some sort of ability to achieve these things in practice. Surely, thinking might be existing. But you couldn’t think without eating. Not in the way humankind exists today anyway.

This is where our advanced kinematics comes into play. Advanced kinematics control is as crucial to defining humanity as we know it as our intelligence, I would say. You, as a reader, might not be dependent on your ability to make fire in order to survive in this world. Perhaps clacking the keyboard of a computer is quite enough. Which is applying kinematics control. But Stephen Hawking could do without it, you say then. Sure, he personally, yes. But those who built the first computers? Those who scribbled down the theorems necessary to build the first computer? Nope. Then, say we all communicated and operated like Hawking did. You would still at least need to eat, right? Even if you could communicate with merely your mind, you would still need to somehow run agriculture. Which, again, means physically manipulating the world. It seems inevitable that our existence depends completely on our ability to physically manipulate the world.

Let’s assume that evolution made us interested in things that were beneficial for our survival and reproduction in various ways. This is probably where sports come in. Apart from competition, it is a way of refining our kinematic abilities. Spear-throwing, wrestling and boxing can probably be directly related to the need for defense and hunting. A tribe being good at handball, or a precursor to it, would probably also be better at throwing stones at both prey and attacking enemies. Those playing football would probably be better at catching a hare. Most cultures employ dancing. While dance can fulfill a variety of purposes, one is certainly displaying reproductive benefits. It could be a way of showing off genes that handle kinematics well. Do we have complex robots? Yes. Do any of them dance well yet? Well, no. A robot that could dance well would thus arguably be more complex than all robots known so far and would be able to perform many complex tasks other than dancing. A person showing off some smooth moves on the dance floor basically communicates “hey, my gene pool is probably super good for a variety of tasks in this world that are beneficial for our existence. Everything from hunting rabbits to climbing trees”.

So, smooth and advanced kinematics is not of interest to engineers and the like only. The interest in it is probably even coded into the DNA of each and every human being. It is fascinating how we as humanity, consciously or not, many times happen to mimic features and stages of evolution.

Controlling the temperature during beer brewing is essential for the quality of the end product, as well as for ensuring efficient production. By eliminating manual control in the production of beer, a more consistent product can be produced. During this thesis project the temperature control strategies for a heat exchanger and the fermentation process have been modelled and developed.

During fermentation of beer, a brewmaster wants to control the temperature at which the fermentation occurs with great precision. Not only does the temperature need to be steady, different temperatures are needed during different stages of the fermentation process. With a precise controller capable of ensuring a unique temperature profile the brewmaster can repeatably create the best-tasting beer possible. This is because the temperature during fermentation affects the flavour profile of the finished product. With the additional benefit of being able to monitor the progress of the fermentation through the online monitoring system, being present at the brewery is no longer needed for routine checks of the beer.

By modelling the biological behaviour of yeast fermenting a variety of different sugars into ethanol and carbon dioxide, precise control of the temperature is achieved. A mathematical model based on a modified version of the equations derived by Engrasser was implemented in Simulink.

During the beer brewing process, the last step before the fermentation starts is the cooling of the wort. This is done by pumping the boiling hot wort through a heat exchanger. By modelling the thermodynamical system a controller could be developed. It is important to ensure that the wort is at a precise temperature when the fermentation starts, in order to give the yeast the best possible environment. By implementing our controller the flow used in production can be increased by as much as 65%, with the added benefit of a consistent and predictable temperature in the fermentation vessel, eliminating the guesswork of manual control.

Assume that we have a mass. Its purpose in life is to move from one point to another in a two-dimensional space. It can do this by applying a force in any direction as long as the magnitude of the vector is limited.

The mass is obliged to visit two points on the way while it is not allowed to violate the laws of motion.

The mass has at most 50 seconds to complete its task, and it should do so as quickly as possible while minimizing the total energy consumed. The laws of motion of the mass are included as constraints when the optimum is defined as:

$$\min_{u_x(t),\,u_y(t)} w\,\underbrace{e(t_f)}_{\text{energy}} + (1-w) \underbrace{t_f}_{\text{final time}}$$

$$\begin{array}{rcll} \text{such that} & & \\ t_f & \leq & 50 & \text{final time} \\ u_x^2(t) + u_y^2(t) & \leq & 1 & \text{maximum force} \\ x\left(\frac{1}{3}t_f\right) & = & 0 & \text{first waypoint} \\ y\left(\frac{1}{3}t_f\right) & = & 1 & \\ x\left(\frac{2}{3}t_f\right) & = & 1 & \text{second waypoint} \\ y\left(\frac{2}{3}t_f\right) & = & 0 & \\ \dot{x}(t) & = & v_x(t) & \text{equations of motion} \\ \dot{v}_x(t) & = & \frac{1}{m} u_x(t) & \\ \dot{y}(t) & = & v_y(t) & \\ \dot{v}_y(t) & = & \frac{1}{m} u_y(t) & \\ \dot{e}(t) & = & u_x^2(t) + u_y^2(t) & \\ \end{array}$$

Since we have two goals working against each other, the parameter $$w$$ is used to weight the importance of the two terms. There is a trade-off.
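To make the ingredients of the problem concrete, the sketch below integrates the equations of motion for a given sequence of forces and evaluates the weighted cost. It is not how ACADO solves the problem: the forward Euler integration, the function names and the start at rest in the origin are assumptions made for the illustration.

```python
def simulate(controls, dt, mass=1.0):
    """Integrate the point mass with forward Euler steps of length dt.

    controls is a list of (u_x, u_y) force vectors, one per time step.
    Returns the final position, velocity and consumed energy.
    """
    x = y = vx = vy = energy = 0.0
    for ux, uy in controls:
        if ux * ux + uy * uy > 1.0 + 1e-12:
            raise ValueError("force magnitude exceeds the limit")
        x += vx * dt
        y += vy * dt
        vx += ux / mass * dt
        vy += uy / mass * dt
        energy += (ux * ux + uy * uy) * dt
    return {"x": x, "y": y, "vx": vx, "vy": vy, "energy": energy}


def objective(w, energy, final_time):
    """Weighted sum of consumed energy and final time."""
    return w * energy + (1.0 - w) * final_time


# Full force in the x-direction for one second.
state = simulate([(1.0, 0.0)] * 1000, dt=0.001)
cost = objective(0.5, state["energy"], 1.0)
```

With $$w = 0$$ only the final time counts, and with $$w = 1$$ only the energy, which is why sweeping $$w$$ traces out the trade-off.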

This problem can be solved using Pontryagin’s Principle. In this case we are using the numerical solver ACADO and the problem is solved for $$w \in \left\{ 0.0,\,0.1,\,0.2,\,\dots,\,1.0\right\}$$.

The set of trajectories for different $$w$$ becomes

where the widest trajectory minimizes the time (high velocity) and the trajectory with the tightest turn minimizes the energy (low velocity).
The Pareto Front is

Here we see that a good trade-off might be somewhere in the region between 25 and 30 seconds since the energy does not change much for a change in duration and vice versa.
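Given a set of candidate solutions, the Pareto front consists of those that are not beaten in both duration and energy by any other candidate. A small sketch of that filtering step, with a hypothetical function name and made-up numbers that are not the results from the solver:

```python
def pareto_front(points):
    """Keep the (duration, energy) pairs that no other pair dominates.

    A pair is dominated if another, different pair is at least as good
    in both duration and energy (both are to be minimized).
    """
    front = []
    for p in points:
        dominated = any(q != p and q[0] <= p[0] and q[1] <= p[1]
                        for q in points)
        if not dominated:
            front.append(p)
    return sorted(front)


# Made-up (duration in seconds, energy) pairs.
candidates = [(10, 9.0), (20, 4.0), (30, 3.5), (25, 5.0), (50, 3.4)]
front = pareto_front(candidates)
```

Here `(25, 5.0)` drops out, since `(20, 4.0)` is both faster and uses less energy.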

Optimal control is incredibly powerful, but it can also be quite difficult to solve complex problems. The formulation of the model needs to be correct and the constraints must be formulated such that a solution exists. For complex problems, the solver needs to be given a good initial guess of the solution, otherwise, it might fail to find a feasible solution at all.