Blog - Page 7 of 10 - Combine | Combine

# Blog

A cat does not like when its environment changes drastically. It might be stressed and start urinating on furniture to protest. Humans may not behave in the same way, but they have ways of showing discontent both directly and indirectly.

To transform an organization we need information to be able to act objectively. A company has a mixture of structured and unstructured data from which historic events up to now and how well we performed can be extracted. This is where most companies stop. Going further requires much more work.

The first step is to find out why we are where we are. Going here requires data-crunching, deduction, and discussions with domain experts. The results are reported to the management and we are done. The natural continuation is to find out what would happen if the historic trends would continue in the future and what to expect. Once we know all of this we need to find out what our next steps should be and decide on some action. An action is a mutation of the current state of affairs and requires a change, either small or big.

The change could involve minor adjustments which fit within existing processes (easy changes) or major changes which would disrupt existing daily patterns of the employees of the company. People are driven by their interests, and if their interests are harmed by change they would try to hold on to or increase their influence according to action theory. They could do so by (Learning to Change, Caluwé & Vermaak):

“…behaving unpredictably; by concealing information or distorting it; by imposing rules for the game, or, on the contrary, simply ignoring them; by forming coalitions; or by blackening somebody’s reputation.”

There are both formal and informal organizations where the latter is undocumented and constantly changing over time. To make people change (not just the formal organization), Caluwé & Vermaak discusses seven different ways to change things of which the last two are inappropriate to use in professional settings.

1. Yellow: Change using power and processes to get everyone on the same wavelength.
2. Blue: Rational change using blueprints with a given outcome (waterfall design).
3. Red: Change using inducements and/or penalties.
4. Green: Change by letting people grow through education and learning.
5. White: Change through self-organization and evolution.
6. Steel: Change using violence and repression.
7. Silver: Change through circumstances (“if God wants”).

Which method (among the first five) to choose (or which to combine) depends on the company, the individuals within the company, and the culture. Data based change might often end up in the blue category, but depending on the conclusions made from the data and how much impact the change has on humans and company structures (both formal and informal), blue might not be the best choice. The conclusion is to never forget the humans involved because if the basic needs and relations of humans are disrupted major discontent might be the result. Pure rationality and objectivity do not always work out as expected.

It seems for me that it is a long way to travel to Sweden to do a master’s in engineering, why did you choose Sweden?
Yes, that is correct. The interest to do a master’s started out during my career as a hardware design engineer back in India. For most part of my bachelor’s in electrical and electronics in India I was interested in projects which was more hardware oriented. Right after graduation I did get a job to assemble/test electric control units (ECU) for solar inverters with a Swedish electrical company, ABB. During this time, I also became curious about how the software in the ECUs was created. Thus, I took the decision to chase a master’s degree and Sweden was close at hand due to its good universities and the positive work culture.

Why did you choose Combine?
Well, when I reached the end of my studies I got involved in several hiring processes and few of them went far. It was the employees’ market at that time and the lack of engineers was striking. I bumped into Combine at a job fair for graduating students and thought that the guys in the booth was both social and technically experienced within the fields where I wanted to make a difference. Moreover, during the hiring process, I did get a completely different experience. The process was completely an eye-opener for me since the manager who was to hire me, forced me to answer difficult questions like, ‘What is your background?’, ‘Why did you choose engineering?’, ‘What do you want to do in your career?’ Those questions created a feeling that this company really cared about their employees and that feeling is still there.

Tell me something from your assignment that you are proud of?
In my assignment, I do work with software/model-in-the-loop testing and how to streamline the testing process. In the beginning, I did tests by hand which was a tedious process and not creative enough. It didn’t take the team long before we started looking for other approaches to do tests more effective and increase the test coverage of the software. After some time, we stumbled upon an automatic testing framework and adapting this was both interesting and challenging. During the last year, we have tried with the help of Python and existing tools incorporate an automatic testing framework in a larger scale than what I have seen before. It was an intriguing task to set the whole framework up to work and I am really proud of the teamwork and co-operation during setting the framework up. Now my work is focused more on setting up the test requirements individually rather than the framework for the test itself.

Is the work as an engineer what you had envisioned?
Well, I didn’t have any clear idea of what an engineer should do. With experience, a picture started to take form that an engineer is someone that solves problems and issues. The problems do vary in both shape and size but yes that is what I and my fellow colleagues do every day. I really enjoy what I do and would love to keep facing more challenging tasks in the future.

You have been in Sweden for a few years now. What do you think about Sweden?
Well to begin with, I am extremely fond of the ‘fika’ culture in Sweden. It’s such a nice custom where in colleagues get together every week, just to have some cakes and coffee! Moreover, I really enjoy working in Sweden because of the support, flexibility and co-operation that you enjoy at your workplace.  The weather can be difficult at times, but after three winters in Sweden, I don’t feel it is going to bother me anymore.

Earlier this month WARA PS, a part of the WASP arena, demonstrated an autonomous search-and-rescue system used at sea. This was done with one UAV automatically searching for people in the water while transmitting aerial images to the rescue center. An autonomous boat followed by a second UAV showing the situation closer to the water surface was guided by the first UAV when a person , in the water; thus making it possible for the person in the water to be saved by the boat.

Almost the same procedure is used when finding people on land. One UAV systematically scans a predetermined area for missing people. If a person in need is found, a second UAV is contacted which flies to the destination and releases a help package, such as a phone or medicine. All of this is performed automatically.

In January this year, a UAV dropped lifebuoys to swimmers in Australia who were in trouble, possible saving their lives.

But not only human lives can be saved by UAVs. Project Ngulia who aims to develop technical solutions to monitor rhinos and thus combat poaching are, among other solutions, looking at UAVs to support park rangers. This summer, a new agreement spanning over three years was signed between the Kenya Wildlife Service and Linköping University. Hopefully, UAVs can become a cost efficient and sustainable solution saving the black rhinos from extinction.

Some of the technologies used in the project are tested at Kolmården Wildlife Park, which is located in close proximity to the technical team at Linköping University while offering a realistic savannah. The test site also renders data that are not highly classified, which is the case in real parks and sanctuaries.

These are just a few examples of ongoing projects that uses UAVs to save lives. However, the sectors of UAV usage have increased greatly. They are now used in marketing, professional film making, construction, delivery, imaging, agriculture, family occasions, entertainment, inventory, weather forecast, environment, insurance, policing, sports and more.

The development and enhancements of UAVs are of great interest of Combine since it contains problems in one of our main fields of expertise. But equally interesting is how the UAVs are used, as saving lives and protecting wildlife are important steps in creating a better planet.

## Introduction

At Combine, we play board games during so-called “Game Nights”. On several occasions, there have been discussions regarding how to shuffle cards efficiently (i.e. having an unpredictable order of the cards).

A deck of ordinary playing cards consists of four suites of 13 cards each giving a total of 52 cards. The total number of permutations of the card deck is given by:

$$P^n_r = \frac{n!}{(n-r)!}$$

where $$n = 52$$ is the total number of cards and $$r = 52$$ is the length of the sequence we want to generate from the $$n$$ cards. Since $$n = r$$ we obtain $$n! = 52!$$, which is approximately $$8 \cdot 10^{67}$$ combinations.

How to shuffle cards have been studied before and also discussed in other ways, and these sources have been used as a foundation for this text.

Given a deck of 52 cards, each card is numbered from 0 to 51 in order ($$F_i = i$$). After shuffling the deck we then know the id of each card in the new order. The Shannon Entropy is then calculated by first estimating the distribution of distances between each card in order:

$$\Delta F_j = F_{j+1} – F_j$$

The Shannon Entropy is then calculated using

$$E = \sum_{j=0}^{51} -p_j \log_2(p_j)$$

The variable $$p_j$$ is a normalized histogram of distances between cards.

The maximum possible entropy is $$\log_2(52) = 5.7$$ (measured in the unit “bits”), which is useful as a reference.
According to literature, it is enough to cut the deck and riffle shuffle seven times to obtain an unpredictable order.

## Overhand Shuffle

Not everyone is able to perform the riffle shuffle and might instead use the overhand shuffle. We experimented with the overhand shuffle, wrote down the order of the cards for each shuffle, and ended up with the following increase in entropy (the red line is the maximum possible entropy).

The first iteration does not increase the entropy at all since the deck was only cut once without any shuffling taking place.

## Hash Shuffle

One idea which was discussed at one Game Night was to shuffle cards using a method similar to how hash values are calculated in computer science. This is a deterministic shuffling method, but by adding some random elements, like cutting the deck, we get some interesting results.

The idea is to choose a number of piles and divide the cards between them. This would force cards to interleave with each other, introducing a distance between them. Just doing this once without any random elements gives the following entropies for different numbers of piles:

If we apply the hash shuffle twice and try all combinations between 2 and 10 piles we find that using 5 piles to start with and then 5 piles a second time again gives the highest entropy. In practice, you should cut the deck between each operation as well.

If you want to repeat the hash shuffle for a fixed number of piles you should obviously avoid 2 and 4 piles.

## Riffle Shuffle

The riffle shuffle is claimed to be one of the best ways to shuffle a deck of cards. And, indeed, given the Shannon Entropy measure the riffle shuffle is by far the best way to shuffle according to the following figure:

The entropy rises very fast. Using other measures it is claimed that 7 riffle shuffles should be enough, and more than $$2 \log_2(52) = 11.4$$ shuffles is not necessary.

## Conclusion

When shuffling you should use the riffle shuffle. Just make sure that you cut the deck between each shuffle since the top and bottom cards tend to get stuck otherwise.

In recent years, the hype around artificial intelligence (AI) has grown a lot. AI is not a new concept and the term has been around since the 1950’s when computer scientist John McCarthy coined the term. But the concept of creating artificial beings has been around much longer and can be found in greek mythology or in Mary Shelley’s Frankenstein.

AI has also been portrayed in numerous books and movies and some are more interesting than others. There are great classics like Stanley Kubrick’s 2001: A Space Odyssey (1968) inspired by a short story by Arthur C. Clarke. In the movie the ship’s computer HAL (Heuristically programmed ALgorithmic computer) 9000 is the main antagonist. This is one of the first movies that portrayed the idea of a human made AI to the masses.

Another classic movie is Blade Runner (1982) based on Philip K. Dick’s novel Do Androids Dream of Electric Sheep?. In the movie humans have designed intelligent androids, called replicants, that the main protagonist, Rick Deckard, must hunt down and retire (terminate).

One common problem in movies, and sci-fi movies in general is how to balance an intricate and interesting story with special effects and action scenes. The sci-fi genre has so much potential when it comes to explore concepts about science, technology, existentialism, human evolution and the possibility of extraterrestrial life. When the boundaries of our current paradigm is not a hard limit, imagination and science can blend in the most interesting ways. The movies and TV shows below are not based around special effects or action but rather trying to explore what it is to be human.

#### Her (2013)

(image from themoviedb)

Director: Spike Jonze
Starring: Joaquin Phoenix, Amy Adams, Rooney Mara, Olivia Wilde, Scarlett Johansson
Genres: Sci-fi, Romance, Drama

Synopsis: Joaquin Phoenix plays a man who installs a new operating system with artificial intelligence to help him with various tasks and over time their relationship becomes more and more romantic.

Being able to convey a romantic relationship without the need of a materialized body, using only a voice, is beautifully executed. Most other movies that explore romantic human/machine involvement are doing so almost by cheating. Using an android, more or less indistinguishable from a human, makes it much easier to relate to. Human/computer interaction by voice is not something new and has been around for a quite some time, mostly as an accessibility tool for people not able to use standard equipment. In October 2011, Siri was launched (by Apple) as the first smart phone integrated voice controlled virtual assistant. At the time, the actions it could perform was rather limited, problems understanding voice input and could be seen as more of a gimmick than a valuable tool. Since then other companies have released voice controlled virtual assistants and the biggest competitors are Amazon Alexa released in 2014 and the Google Assistant released in 2016. These products are getting better and better and voice controlled virtual assistants are probably here to stay although they are not perfect.. yet.

Her, artificial intelligence and the concept of time

Humans, like animals are evolutionary equipped to experience some basic aspects of time. What happens when an artificial intelligence emerge with the computational power of living years, decades or millenniums every day, hour or second? Their concept of time will be something completely different which could have huge implications. For some perspective, OpenAI Five  recently competed with five bots against five former professional gamers in the competitive game Dota 2. Everyday the AI played 180 years worth of games against itself running on 256 GPUs and 128 000 CPU cores. When summing up each character (a total of five) it amounts to 900 years every day. Time is relative but even more so for an AI running on a giant cluster and there is a great scene about this in the movie.

#### Ex Machina (2014)

(image from themoviedb)

Director: Alex Garland
Starring: Domhnall Gleeson, Alicia Vikander, Oscar Isaac
Genres: Sci-fi, Drama, Mystery, Thriller

Synopsis: Meet Caleb Smith, a talented programmer at a large tech company, who wins an office competition to stay a week at the CEO’s remotely located house. When he arrives he is introduced to an android who according to Nathan Bateman, the CEO, has passed the Turing test but he wants more validation before going public with the news. Surprise, surprise.. Caleb starts to develop feelings for the android.

How close is humanity to develop an human-like conscious AI and what happens then?

What happens then?, is a question people have been debating for a long time. Some view this as the inevitable future and hope for the human race and others see it as humanity’s demise. This has also been explored in sci-fi books and movies over the years and The Terminator (1984) portrays one of the darker scenarios for mankind. But how close are we to develop an AI able to pass the Turing test? Probably not very close. Even though great progress has been made in the last few years, with everything from AlphaGo being able to beat the top players at Go to Libratus a poker AI able to beat top players in heads up no-limit Texas hold ’em (and of course the OpenAI Dota 2 bots mentioned earlier), the Turing test is something else to beat. The AI’s created today are heavily specialized and contextual but can’t do much outside their specialization. And in order to pass the Turing test an AI would have to behave like a human when exposed to a multitude of different questions that could range from anything to everything.

The idea of sitting down with an AI, like Caleb, having a conversation and slowly become more and more amazed how well it performs, is thrilling. This is something else than the Turing test because it should not be known beforehand whether it is a human or not, but that doesn’t make it less interesting. Problems will arise when the AI’s starting to become too humanlike and at the same time have their own agenda. Sooner or later it would become really difficult to be able to control all the possible outcomes when dealing with an AI.

#### Black Mirror (2011)

(image from themoviedb)

Creator: Charlie Brooker
Genres: Sci-fi, Drama, Mystery, Thriller

Black Mirror is a TV show exploring new technology, society and possible scenarios from today to a more distant future. It has been called a modern day The Twilight Zone (1959), also a great show but many episodes can feel rather outdated. Every episode is standalone and explore a new theme. This means that viewers don’t have to see the episodes in a chronological order and can cherry pick the ones that seems most interesting, and even skip the first episode in the first season. This episode is mostly built on shocking the audience and is not representative for the rest of the show. Some favourites:

The Entire History of You (2011)

Synopsis: Some people have an implant recording everything they see and hear.

Think Google Glass but built into the body and it is not hard that this could create all kinds of problems. At the same time a lot of people were really excited about the idea of wearing glasses recording everything and honestly it is really not that far from our current reality.

San Junipero (2016)

Synopsis: Two women meet in a California-esque small town named San Junipero in 1987.

This is not a story about relationship with an AI. This is about human relationships, love, consciousness and how future technology could change our lives for the better.

Hang the DJ (2017)

Synopsis: In a maybe dystopian future two people, Amy and Frank, try out a dating/matching service that always puts an expiration date on all relationships. The reason for this is that each short relationship will give “the system” more data to eventually find the optimal match for each individual using the service.

So what if you could see the expiration date for a relationship, is it something you really want to know. And how would this knowledge change your actions? It is not that far from the classic theme of knowing the day you will die.

You have been working with electrical systems and software in vehicles your entire career. Why?

Actually, I’m not into cars or any other vehicles specifically. What drives me is mainly to solve problems in a challenging technical environment. It might as well have been airplanes or something else entirely. It just so happened that cars and buses had the right combination of technology and challenges to attract me.

How has your career developed at Combine?

The first step was into the telematics area. From there I moved on to Infotainment, and currently I am helping a customer with IT and processes relating to software development and software management.

When Combine sends you to help a customer, what can the customer expect?

I have realized that I have a knack for understanding how the processes, support systems and organization in a company are meant to facilitate technical development. Once I understand this, I bring out my broomstick and begin clean-up operations so that things work the way they are supposed to. Consequently, my CV is full of activities such as ”responsible for project documentation”, ”task force leader”, ”team leader” and similar. In my current assignment I also have the opportunity to develop improvements and IT that raise the quality of the customers processes.

If you could choose a completely different assignment, what would it be?

I believe that we have the technology, or the ability to develop it, needed to help solve some of the big issues facing us globally, issues like our impact on the climate. Creating the right incentives and mechanisms, as well as developing the solutions themselves, would be really stimulating and interesting.

It certainly does. I have been brewing beer for many years and I finally started a microbrewery called Sad Robot Brewing. Being who I am I tried to learn as much as possible about the engineering side of all the steps in brewing processes, such as chemistry and thermodynamics. Just like at an assignment I like things to be clean and controlled, so the only solution was to team up with some friends and do it ourselves. It was just like a second job. I spend less time on brewing nowadays, but one of the things I have done lately is to mentor a thesis project at Combine aimed at controlling and monitoring the brewing process (Editor’s note: you can read about this in the Combine blog here).
I also do some acting in theatre and movies, so not everything is technical. And yes, you can probably figure out what kind of books and movies I like from the name of the brewery.

#### Image processing in Sympathy for Data

This is the second blog post in a series of posts on image processing using Sympathy for Data, an Open-Source tool for graphically programming data-flows. See the previous entry for an example of how you can read the time from an analog clock using only basic image processing building blocks. No programming required.

When it comes to object recognition today most people think about deep learning and throw vast datasets onto deep machine learning algorithms — hoping that something will stick. One thing that all such algorithms have in common is that they all have a large number of parameters, requiring an even larger number of examples to be trained. There are two major costs associated with this approach: firstly the computational cost in training the datasets, usually using a single or a cluster of high-end graphic cards; and secondly the difficulty in acquiring large enough datasets to do the training with.  Sure, there exists techniques for artificially extending existing datasets into larger ones in order to help against over fitting, but even these cannot handle the case of datasets with only a hand full of examples. With all the hype of deep learning it is easy to forget that earlier approaches to object recognition, while much more limited in what they could solve, did not suffer from these difficulties and can sometimes still be favourable to be used.

If we look back at when image recognition was first considered as a problem to be solved with computers we see that the problem was at-first greatly underestimated. Back in the summer of 1966 a very optimistic project was started at MIT using only the student summer workers that year and with the aim of solving the computer vision problem. As you can read in the PDF the final goal was, in hindsight, a quite ambitious one indeed:

“The final goal is OBJECT IDENTIFICATION which will actually name objects by matching them with a vocabulary of known objects”.

Needless to say, this task proved more complex that what was first imagined, and have since led the the creation of a whole field of research. It is not until recently, more than 50 years after that summer project that we can say that general purpose object recognition is a more or less solved or solvable problem.

In my previous image processing post we looked at a simple image processing task in reading the time from an analog clock, and showed how this could be solved using the image processing tools available in Sympathy for Data, all without having to write a single line of code.  A major factor in this solution was by limiting ourselves only to images acquired in a very specific way. This solution generalizes more to industrial image processing such as eg. reading a pressure valve rather than doing general purpose like reading like a random clock you find on the side of a building.

In this and the upcoming image processing post I will show how we can use the image processing tools and the machine learning tools of Sympathy to similarly solve an object recognition task under well defined circumstances. These circumstances generalizes again more to an industrial setting, such as analysing objects on a conveyor belt, where we can have a clearly defined environment and camera setup.

For this purpose we will have a camera mounted straight above the incoming objects. The objects are photographed against a neutral background (white) clearly distinguishable from the objects themselves (metallic grey). Furthermore we ensure that the lighting is smooth and even over the whole area and that no sharp shadows are cast by the objects themselves or anything else. In the example dataset used here we use pictures of a mix of fasteners, with the target of identifying the screws. Furthermore we ensure that objects are overlapping since it would require more advanced techniques to separate overlapping objects,  a problem almost as hard as object recognition itself. If we would like to do this in an industrial setting we could use a mechanical solution to ensure this before the objects enter the belt, eg. using a suitable hopper.

#### Segmenting the image

We will start by solving the problem of segmenting and labelling an input image, with the task of deciding which areas of the image correspond to different objects.  The intention here is to pick out individual objects and to classify each found object whether it matches the target object.

Thus our workflow will contain the following steps:

1. Separate the image into pixels that belong to objects or to the background
2. Cleanup this image to remove noise and to completely close all objects
3. Create labels for each pixel
4. Extract a list of binary image masks, one per found label.

A typical step in many image segmentation tasks is to use a simple thresholding algorithm. We can use simple thresholding and the fact that the metallic grey objects all are darker than the background paper in order to create a binary representation of the pixels that belong to objects. We start by attempting to use a simple basic threshold at the value 0.5.

Note that we added a filtering step that inverts the image by scaling it by a factor of -1 and adding an offset 1 to it before we do the thresholding. Thus we can ensure that a completely dark pixel (value 0) becomes 1.0 before thresholding and is classified as a “true” boolean after the thresholding.

We can also note that the result of the basic thresholding is quite poor, We incorrectly classify the bottom half of the image as belonging to an object. If we raise the threshold until no background is classified as an object, then we instead start losing pixels from the objects that are classified as background. You can see this effect in the images below, where we have a higher threshold on the right side than on the left side.

Furthermore, just using a simple scalar value as a hard-coded threshold will not work very well if there is even the slightest change in global illumination from picture to picture.

We can use one of the automatic thresholding algorithms that automatically finds a scalar suitable for thresholding. The simplest automatic thresholding algorithm is the mean or median which sets the threshold such that half the image will be True and half the image False. This is however seldom good, and most definitively not good for our application since we are almost guaranteed that background (which is more than 50% of the image) is classified as part of the objects.

Other alternatives to automatic thresholding include a number of algorithms that consider the overall distribution of pixel values and tries to find a suitable threshold. For example the Otsu algorithm assumes that the pixel values follows a bi-modal distribution and find a global threshold that minimises the variance within each found class.

The results of Otsu is surprisingly good for most images, as you can see in the image above. However we note that this algorithm still misses some parts of the objects (see the upper edge of the circular washers in the image above). Sometimes, it is impossible to get a good enough result by just setting a single global threshold value.

Other alternatives exists that perform an adaptive threshold that considers a window around each pixel and calculates a threshold value for that pixel based on this window. With this technique we for instance can easily compensate for any unevenness in the overall lighting.

One example of this is an adaptive gaussian thresholding method. Here we first perform a low pass filtering with a gaussian kernel of size 21 and sigma 11. We take the lowpass filtered value and apply an offset (-0.01) before testing if it is higher or lower than the pixel that is being thresholded. We picked the value for the kernel size based on the overall size of the objects (the circular ones are approximately 20 pixels wide). The offset compensates for small irregularities in the background itself.

The noise on the background can be removed in a later stage using morphological opening. Before we progress to this however we consider one more approach which is to instead extract all the edges in the image and to perform morphological operations to close the objects based on the edge data.  We do this by applying a Canny edge detector to the raw input image (no pre-scaling step needed anymore). As we can see below this method generates no false positives and does capture all sides of the objects.

The interior of the objects can filled in using morphological closing after the Canny edge detector. What this does is to perform to perform a dilation operation followed by an erosion operation where the dilation makes all objects “thicker” by a given radius and the erosion makes them correspondingly “thinner”. Each of these operations are done by checking a neighbourhood around each pixel and taking the MAX or MIN value in the neighbourhood, respectively.

Consider the image on the left side below. In this image if we perform dilation then we get a white pixel in the areas marked red and green and only the area marked in blue would get a black pixel. If we instead perform erosion then we get black pixels in the red and blue areas and only the green area stays white. In the right side of the example below we can see the result of performing the erosion operation followed by a dilation operation. It has first made the white objects significantly thinner, followed by thicker.

For many objects making them thicker followed by thinner would not change the overall shape of the object. However, if two edges both become thick enough to touch each other then there is no black areas in the middle that can make them thinner again. Thus the end-result is that the objects have been closed as can be seen in the images below:

One problem that we can spot with the morphologically closed image is that some objects are now touching each other due to the thickening radius being larger that the distance between the objects which have created small bridges between some of the objects. To compensate for this we can perform a morphological opening that removes the small bridges between the objects. This step also removes all the small dots of false positives given by the thresholding algorithm if that one is used instead of the edge detection.

For the final step before we can start working with the objects it to use labeling to create a unique ID for each object. The labeling algorithm takes a binary image as input and creates an image with integers for each pixel. The integer values of a pixel correspond to a unique value for each object. If there were even a single pixel linking two objects to each other then both objects would be assigned the same integer value. We can visualise the result of this step by clicking on the object, this gives a pseudo-colour for each object based on a default colour map.

Note that since objects that are close to each other have similar ID’s then they are mapped to almost the same color. The ID values assigned differs even when not evident in the image below:

One final node that is useful is to create a list of all the found objects. The node Image to List can be used to convert the labeled image into a list of images. Use the configure menu to select “from labels” to do this conversion.

As we can see in the preview window below we have a list that contains many images. Each entry in the list is an image mask that is true only for one single object (as defined by the unique ID’s given by the labeling operation). We will use these images as the inputs to our classification algorithm to detect the individual objects.

#### Summary

In this post we have looked at the segmentation problem and shown how simple thresholding or edge detection algorithms can be used together with morphological operations and labeling to create a list of objects in an input image. This list of consists of a mask singling out each individual object in the image, one at a time. In part 2 we will continue to perform the classification of each found object.

Märta, why did you choose engineering?
I have always been interested in technology, wanted to know how stuff works. I also liked math and physics and thought it was kind of easy. In gymnasium I first planned to study natural science but ended up choosing more technology-oriented classes since the combination of math and reality was tempting. I think that might also have been a reason to why I focused so much on control theory.

What was the best part of your engineering studies?
Without a doubt my time spent as an exchange student at University of California Santa Cruz!

That sound like a great experience!
Yes, it was fun to take other courses than what was available at Lund University. I also got the opportunity to work in the Autonomous Systems Lab, playing around with robots and drones. This was very valuable since it was like a mix of working, studying and doing research. California is also such a great place so besides studying I spent a lot of time surfing and skateboarding.

Autonomous drones sound like the optimal way to apply math in reality. How did you move on from that?
Well, after California I returned to Sweden in time for my master thesis. Since I had spent quite some time working with autonomous systems and drones I wanted to do my thesis in that area. With that said, I was thrilled when the perfect project was available at SAAB.

It was about controlling a swarm of autonomous flying drones. Having multiple drones in a swarm leads to many interesting problems ranging from internal distance estimation between the drones to the high-level behavior of the swarm.

So now you work as an engineer, is it all you thought it would be?
Well, I never really had any clear picture of exactly what it means to “be an engineer”. It wasn’t until the final years at the university I started to get a better picture of what it means. But yes, I work with applied mathematics every day so in that sense it is what I envisioned.

How does a typical day at work look like?
I work in an agile environment, kind of like scrum-ish… The day starts with a daily scrum meeting where we go through what we work on and potential issues. After the meeting it’s time to start work on my current tasks. Right now, my main focus is on PLC programming, coding new features and testing them out at the machine or in a virtual environment. Some time is also spent on developing the virtual test rigg, bug fixes etc. My days are very flexible, and I control a lot of the time myself and that suits me perfect.

Why did you choose Combine?
I started my career at a larger consulting company. I liked the role of a consultant, but I felt that I wanted to work for a company more focused on the technologies I’m interested in. I had also heard good things about Combine from friends.

Do you also want to work with applied mathematics and control systems development as a consultant at Combine? See if we have any available positions, or just give us a call and see if we have something coming up soon.

For todays post we have a guest author, Lia Silva, that works in Data Science and have her own blog https://statsletters.com/ on mathematics, statistics and other fun stuff. Without any further presentation, see what Lia have to write about studying our relationship with consumerism through graph theory and statistics:

One important difference since a decade ago is what can be measured about how a user consumes a product. Lately, it seems like every little thing that we do can be used to build a projection of ourselves from our habits. And in the end, that projection can be used to poke the reptilian parts of your reward circuitry so they release the right cocktail of hormones. A cocktail that makes you choose bright red over dull gray, reach for your wallet or click “I accept the terms and conditions”.

Through the years, recipes for such cocktails have been perfected by different disciplines. As an educated consumer, actively tasting those recipes in modern products can be as interesting as wine tasting, minus the inebriation. This is the first of a series of posts intended to help you be more aware of your own reward circuitry by using interpretations that different algorithms build from observing your measurable actions.

Another intention with this series of blog posts is to show that the methods are not necessarily:

• Absolute
• Inherently objective
• Infallible

And definitely NOT suitable to use blindly e.g. “press the Analytics Button and have the neural network tell me everything”*. If anyone promises that without disclosing any assumptions, make sure to ask LOTS of questions.

The “Serpent people” series will present some textbook representations suitable for modeling this problem, aspects that are better reflected on each one of them, and trying out different open-source libraries on different artificially generated models of “people according to what we know about them”.

Some of that material is already in a very fluid shape in this notebook if you can’t wait to play by yourself :). The representation in there is simply what I considered to be natural for the problem itself. I plan on elaborating that representation with classics such as Frequent Itemset Mining and Associative Classification. For those, you can start by checking out Chapter 10 of “Data Mining” by Mehmed Kantardzic.

And that’s the teaser for what will come. For now, I will leave you with this David Bowie Song. Granted, it’s “Cat People” instead of “Serpent People”, but pretty cool still.

## The basic idea

The basic idea we have for reading the time (or any other analog device!) is to first capture an image of the device using a web camera and use the image processing tools in Sympathy for Data to read the hour and minute hands from the clock. Our goal is to do this with only the open-source image processing tools that are included in Sympathy for Data, and no custom code.

Our aim is not to create a solution that works for any image of any clock, but rather to show how we can reliably read the values of the given clock given a very specific camera angle and lighting situation. To do this for any clock and lighting setup requires more advanced techniques, often involving machine learning, and which harder to guarantee that it works in all of the target situations.

For the purpose of this blog post, we will not be using machine learning or any other advanced algorithms but rather rely on basic (traditional) image processing techniques. This problem resembles very much that of designing image processing algorithms that are used every day in industrial production — where you have a highly controlled environment and want simple and fast algorithms.  By controlling the environment we can make this otherwise complex task very simple and easy to design an algorithm for. When doing image processing for industrial purposes it is commonly found that you can simplify the problem and increase robustness by heavily controlling the environment and the situation in which your images are acquired.

For the impatient, you can download the dataset with images that we prepare in the first part as well as the finished Sympathy for Data workflow.

## Acquiring the data

Step one is to acquire some images of the clock and to try to analyse them in Sympathy. If you don’t want to repeat these steps yourself you can simply download the full dataset from here.

We start with a naive approach and just place the web camera in front of the clock and record images once per minute over a full day. Some example of these images are below:

What we can see in the images above is two main problems: the lighting varies widely depending on the time of day, and the lighting in the image itself varies widely due to specular reflections and makes part of the hour-hand invisible second image above. Whatever algorithm we come up with will have a hard time to deal with an invisible hour-hand.

We can also note that with a stronger and directional light we would have shadows cast by the hour and minute arms of the clock is projected at different part of the face of the clock depending on the incoming light direction. If we where to directly try to analyse these images we would need to compensate for the shifting position of the shadows, and we could not use any simple thresholding steps since the overall light changes widely. Any algorithms we create for these kind of images will inherently be more complex and possibly more prone to failures when the light conditions change. We would need to test if over a wide range of conditions (day, night, sunset, rainy weather, sunny weather, summer, winter) to be sure that it works correctly under all conditions.

An attempted first fix for the specular reflections was to remove the cover-glass of the clock, the idea being that the face paint of the clock wouldn’t be reflective enough to give these issues. This however wasn’t enough to solve the problem with disappearing arms for all lighting conditions, and are we to apply this idea to an industrial setting it may oftentimes not be possible to make such a modification. A better solution is therefore to remove all direct light in favour of a setup with only diffuse lights.

To solve both these problems we choose to make a controlled environment with only diffuse light and where we know that the only thing that changes are the positions of the clock’s arms. We force a constant light level by enclosing the clock and camera in a box with a lightsource. This way we can also eliminate the the shadows of the clocks arms by placing the lightsource from the same direction as the camera.

## Building a camera friendly light source

In order to eliminate the shadows from the clock arms and to provide a even and diffuse lighting conditions we place a ring of white LED’s around the camera. Usually this is done using professional solutions such a light ring for photography, but we’ll manage with a simple 3d-printed design and some hobby electronics. You can find the downloads for these over a Thingiverse.

The design for this light is simple ring where we can add the lights plus a diffuser on top of it to avoid any sharp reflections.

After printing the parts above we place a number of white LED’s in the small holes in the middle part above. By twisting the pins together with each other on the underside (take care with the orientation of anode and cathode!) we can easily keep them in place while at the same time connecting it all up. See if you can spot the mistake I did below in the wiring. Fortunately it was salvageable.

Next step is to place the diffuser over the LED’s and to attach it onto your web camera. Power it with approximately 3V per LED used. Point it straight at the clock and put an enclosure over it. We can used a simple carton box as a simple enclosure that removes all external light.

Congratulations, you can now get images of the clock with perfect lighting conditions regardless of sunlight, people walking bye, or any other factors that would complicate the readings.

## Pre-processing the images

When doing image processing it is common to operate on grayscale images unless the colour information is an important part of the recognition task. For this purpose we first run a pre-processing step on the whole dataset where we convert the images to greyscale and downsample since we don’t need the full resolution of the camera for the rest of the calculations. This can easily be done using Sympathy for Data.

First create a new flow and point a Datasources directory to a copy of the dataset (we will overwrite the files in place). Add a lambda node by right clicking anywhere in the flow and select it. Connect a map node to apply the lambda for every datasource found.

Before you can run it, add the nodes below into the lambda node to do the actual image conversion. You need to select greyscale in the configuration menu of the colour space conversion node, and rescale X/Y by 0.5 in the transform image node. Note that the “save image” node here overwrites the images in place, so try not to run the node until everything is ok. Another option would be to compute a new filename to be used instead using eg. a datasource to tables node and a calculator node.

In the dataset that you can download we have already done these conversions (downscaling to 800×500 pixels) to save on bandwidth.

## Analysing the images

Our goal in this section is to create a Sympathy workflow that allows us to take any image of the clock and convert it into an hour and minute representation of the time.

### Creating a template

For the first step we want to extract only the arms of the clock that should be analysed. For this purpose we will use a practical trick to easily detect only the moving parts of the image. We do this by first calculating a template image that show how the images would look if all the moving parts were removed. This trick only works when the camera is fixed and there are no major changes in overall lighting.

Start by calculating the median (or in our case max since we know the arm’s are black) of a few different images from the dataset. This will give an image where the arms of the clock is removed.

Select max as operator in the Overlay Images node below. This works since we know that the arms are darker than the background, and whichever pixel has a light colour in any of the images will have a light colour in the final image.

The results look surprisingly good given that we only used a few images (where the arms where all in different positions):

We save this template in a separate file so that we don’t have to redo the calculation for each image that should be processed.

### Extracting the minute and hour arms

Continue by creating a new workflow and load one of the images to be analysed. We start by making a subtraction of the image to be analysed from the template image.

This will give an image where only the arms are visible, you will need to select “subtract” as the operation for the “overlay images” node.

The next step will be to perform a threshold to pick out only the arms of the clock as a binary image. To do so we use a Threshold node and set it to basic threshold. We can figure out a good threshold level by looking at the histogram above of the image after subtraction. We see that the maximum value of the image is 0.55 and that something significant seem to happen around the 0.4 mark (note that the graph is logarithmic!).  We set the threshold to 0.35 and get the results shown above.

Since there can be some small smudges and missed spots on the binary image we apply morphological  closing on the image using a structuring element of size 20 which should be more than enough to compensate for any missed pixels caused by noise, scratches on the object/lens or the otherwise black areas of the image.

Finally, we can note that we only actually need to see the tips of the hour and minute hand in order to read the time. If we sample to check for the the minute arm at every point in a circle with a radius closer to the edge of the picture we can know which pixels belong to the minute arm. Similarly, if we sample every point in in a smaller circle we get one or two sets of pixels corresponding to the hour arm when it is below the minute arm or  both the hour arm and minute arm when they are not overlapping.

We can do this sampling by first creating two new templates that we use to select a subset of the pixels. This is done by drawing a white ring on an otherwise empty image for each of the two selections. We can do this using the Draw on image node and a Manually create table node that gives the XY coordinates (416, 256) and radius 200 for a circle with colour 1.0 and 170 for a circle with colour 0.0 — corresponding to the ring selecting the minute arm below:

After multiplying these two templates with the thresholded image (using again the overlay images node)  we get two new images with blobs corresponding to the tips of the hour and minute hand:

All that is left now is to extract the coordinates of these blobs and to apply some math to convert them into hours and minutes.

### Computing the minute

We can compute the position of the minute hand by using a Image statistics node with the algorithm “blob, DoG” with a threshold of 0.1. This algorithm finds “blobs”, or light area on a dark background, in an image by subtracting two low-pass (gaussian) filtered version of the image filtered at different scales.

All other parameters can be default, but the default value for threshold of the difference-of-gaussian algorithm is too high for our inputs.

Now all we have to do is to convert the XY values 443, 426 of the tip of the minute hand into an actual value in the range 0 – 60. We can do this by calculating the vector from the center of the clock determined from the raw image as (416, 256) to the point of the detected blob. This gives us a vector (187, 10). By taking the arctan of this vector in a calculator node we can get the angle to this point and convert it into minutes. Note that we invert the y-component of this vector to compensate for the difference in coordinate systems (y-axis in images point down):

### Computing the hours

In order to compute the hour we need to eliminate one of two possible candidates for the hour. Consider the blobs shown below, from just this data it is hard to know which hour it is:

However, what we can do is to take the position of the minutes that we calculate above and clear out one of the two blobs above. Since we know the radius that we used for the circle multiplied with the data, we can easily draw a black area on top of the location where the minute arm is located and at the given distance from the centre:

For this purpose we use another calculator node to compute the X/Y coordinate above and to draw a black circle onto the image at that location. For clarity it has been drawn as a brown circle in the example above to see what area of the image is deleted.

We extract the hour value from the remaining blob, if there is one, similarly as to how the minute value was calculated. Note that if there is no blob in the image containing the tip of the hour arm then the expression belows gives a NaN value.  This happens when it is under the minute arm.

Finally, we can finish the flow by adding in a special case calculation that checks if the ‘hour’ column has the NaN value and if so instead derive the hour position from the minute position. In this step we also round the minutes to even number and round the hours down to nearest smaller integer. Note that we subtract a fraction of the minutes from the hours before rounding due to how the hour arm moves closer and closer to the next hour as the minutes raise.

We also subtract 1 (modulo 60) from the minute position since the captured images where all slightly rotated clockwise. We could have compensated this in the original pre-processing if we had noticed it earlier.

## Time to check the results

Before we are happy with the flow, let’s check how well it performs versus the ground truth. Since the timestamp was saved when each image was captured we can easily compare these values with the results of the flow. Due to the sampling process and since the seconds of this clock wasn’t synchronized with the seconds of the computer sampling them — we should expect to be off by one minute in some of the readings.
As we can see in the table below we successfully read the time, with a difference of at most one minute, for first 100 images.

By only allowing ourselves to use the first 50 images when we developed the flow, and then validating it by running on a larger dataset we gain confidence that the algorithm works for all the situations it will encounter. We have run it the full dataset of 700+ hours without any other errors.