
# Blog

ODF will be an arena to build competence and nurture innovation. It is open to all who believe that crunching data from the ocean is, first of all, fun; second, holds the answers to a sustainable blue economy; and third, gets really productive when different competencies work together! Data collected from the ocean poses challenges such as numerous data sources with varying characteristics and time scales, communication difficulties and a harsh environment for the sensors, which can lead to poor data quality. Overcoming these challenges using efficient AI will be vital for the future of the blue economy and sustainable ecosystems.

ODF will be headed by professor Robin Teigland from Chalmers University of Technology. SCOOT (Swedish Centre for Ocean Observing Technology) takes on the coordinating role. Stay tuned for more information in the future.

ODF is part of Vinnova’s investment to speed up development within AI.

#### Introduction

Deep learning (DL) is a hot topic in modern society and one of the most rapidly growing technical fields today. One of many subjects that could benefit from deep learning is control theory. The nonlinearities of deep learning models enable implementation of a wider range of functions and adaptability to more complex systems. There has been significant progress in generating control policies in simulated environments using reinforcement learning. Algorithms are capable of solving complex physical control problems in continuous action spaces in a robust way. Even though the tasks are claimed to have real-world complexity, it is hard to find an example of such high-level algorithms in an actual application. Moreover, we have found that in most applications in which these algorithms have been implemented, they have been trained on the hardware itself. This not only places high demands on the hardware of the system but might also be time-consuming or even practically infeasible for some systems. In these cases, a more efficient solution would be to train on a simulated system and transfer the algorithm to the real world.

Furthermore, one might wonder if a traditional control method would perform better or worse on the same system. In order to recognize how well the deep learning algorithm is actually performing, it would be interesting to compare it to another method on a similar control level.

The main purpose of this project was to provide an example of a fair comparison between a traditional control method and an algorithm based on DL, both run on a benchmark control problem. It should also demonstrate how algorithms developed in simulation can be transferred to a real physical system.

#### Design

Due to its unstable equilibrium point, the inverted pendulum setup is a commonly used benchmark in control theory. Many variations of this system exist, all based on the same principal dynamics. One example is a unicycle, whose principal dynamics can be viewed as an inverted pendulum in two dimensions. Thus, as a platform to conduct our experiments, we constructed a unicycle.

Figure 1: CAD model of the unicycle

Our main focus for the design was to keep it as lightweight and simple as possible. To emphasise the low hardware requirements, we chose the low-cost ESP32 microcontroller to act as the brain of our system. On it, we implemented all sensor fusion and communication to surrounding electronics necessary to easily test the two control algorithms on hardware. We dedicated one core specifically for the two control algorithms and added a button to switch between the two algorithms with a simple press.

To be used in simulation and control synthesis, we derived a nonlinear continuous-time mathematical model using Lagrangian dynamics. The unicycle is modelled as three parts: the wheel, the body and the reaction disk, including the inertia from all components in the hardware. It has four degrees of freedom: the spin of the wheel, the movement of the system, the pitch of the system and the rotation of the disk. The external forces on the system come from the disk and wheel motors.

#### Controller Synthesis

The infinite-horizon linear quadratic regulator (LQR) is a model-based control problem whose solution is a state feedback controller. The feedback gain is determined offline by minimizing, from an arbitrary initial state, a weighted quadratic cost on the states and inputs over a time horizon that tends towards infinity. The LQR problem is one of the most commonly solved optimal control problems. As a mathematical model of the system is available and due to its characteristics, we implemented an LQR controller for this project.
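To illustrate how such a gain can be computed offline, here is a minimal sketch using SciPy's continuous-time Riccati solver. It assumes a linearised model of the form ẋ = Ax + Bu. The matrices, weights and pendulum length below are hypothetical values for a simple inverted pendulum, not the unicycle model from the project.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical linearised pendulum: state x = [angle, angular velocity],
# input u = torque. These numbers are illustrative, not the unicycle model.
g_over_l = 9.81 / 0.5          # gravity / pendulum length [1/s^2]
A = np.array([[0.0, 1.0],
              [g_over_l, 0.0]])
B = np.array([[0.0],
              [1.0]])

# Weights on state deviation (Q) and control effort (R); assumed tuning choices.
Q = np.diag([10.0, 1.0])
R = np.array([[0.1]])

# Solve the continuous-time algebraic Riccati equation for P,
# then form the state-feedback gain K so that u = -K x.
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)

# With this gain, the closed-loop matrix A - B K should be stable,
# i.e. all eigenvalues should have negative real parts.
closed_loop_eigs = np.linalg.eigvals(A - B @ K)
```

Since the gain `K` is a constant matrix, the online controller reduces to one matrix-vector product per sample, which suits a small microcontroller.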

For our deep learning control of the unicycle, we chose proximal policy optimization (PPO). The method builds on policy-based reinforcement learning, which offers practical ways of dealing with continuous spaces and an infinite number of actions. PPO has shown superiority in complex control tasks compared to other policy-based algorithms and is considered to be the state-of-the-art method for reinforcement learning in continuous spaces.

To make a long story short, we trained the algorithm for the system by writing up the mathematical model of the unicycle in Python as an environment for the agent to train in. The actions the agent can take are the inputs to the two motors. After taking an action, it moves to a new state and receives a reward. After some millions of iterations of taking actions and receiving rewards, the agent eventually learns how to behave in this environment and creates a policy to stabilize the unicycle.
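The action, state and reward loop described above can be sketched with a toy environment. This is a simplified, hypothetical stand-in: it uses a single-input inverted pendulum instead of the two-motor unicycle model, and the class name, reward shape, limits and step size are all assumptions. A random policy takes the place of the PPO agent.

```python
import numpy as np

class PendulumEnv:
    """Toy inverted-pendulum environment (hypothetical stand-in for the unicycle model)."""

    def __init__(self, dt=0.01, max_angle=0.5, seed=0):
        self.dt = dt                  # integration step [s]
        self.max_angle = max_angle    # episode ends beyond this pitch [rad]
        self.rng = np.random.default_rng(seed)
        self.state = np.zeros(2)

    def reset(self):
        # Start near the unstable upright equilibrium.
        self.state = self.rng.uniform(-0.05, 0.05, size=2)
        return self.state.copy()

    def step(self, action):
        angle, rate = self.state
        torque = float(np.clip(action, -1.0, 1.0))   # motor saturation
        # Forward-Euler step of angle'' = (g/l) * sin(angle) + torque.
        accel = (9.81 / 0.5) * np.sin(angle) + torque
        rate = rate + accel * self.dt
        angle = angle + rate * self.dt
        self.state = np.array([angle, rate])
        # Reward staying upright with little effort (an assumed reward shape).
        reward = 1.0 - angle**2 - 0.01 * torque**2
        done = abs(angle) > self.max_angle
        return self.state.copy(), reward, done


env = PendulumEnv()
state = env.reset()
total_reward, steps, done = 0.0, 0, False
while not done and steps < 1000:
    action = env.rng.uniform(-1.0, 1.0)   # random policy; a PPO agent would replace this
    state, reward, done = env.step(action)
    total_reward += reward
    steps += 1
```

A training library would call `reset` and `step` in the same way, but would use the collected rewards to update the policy between episodes.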

#### Results

Both methods successfully managed to stabilize the system. The LQR outperformed the PPO in most respects in which the hardware did not limit the control. As an example, in practice, the LQR managed to stabilize from a maximal pitch deviation of 28 degrees compared to the PPO method, which managed 20 degrees. We observed this suboptimal behaviour of the PPO in several situations. Another example can be seen when applying an external impulse to the system.

Figure 2

As can be seen, the LQR handles the impulse in a somewhat expected way while the PPO goes its own way.

This unexpected behaviour is not desirable for this system, but we think it might be beneficial for other systems, for example systems with unspecified or even unknown optimal behaviour. However, for systems with a specified, known optimal or expected behaviour, we would recommend the good old LQR, if applicable.

Even when exposed to model errors, the PPO did not show any sign of unreliability compared to the LQR in states it had encountered during training. However, when introduced to unknown states, the performance of the PPO is impacted. By keeping the limits of the training environment general enough this should not be an issue. However, when dealing with systems with large or even unknown state limits, LQR is probably a safer option.

We believe our project has shown a good and fair comparison between these two methods on the same system, and has given a good and informative example of how a DL algorithm trained in simulation can be transferred to a real physical system. The unicycle is of course only one example of such a system, but we feel that we encountered a lot of interesting features that can be generalized and used to benefit other projects. If you have doubts, please read our report!

“The cloud”, or more properly “cloud computing”, is a general term used when talking about computations or data storage on a server at another location, accessible through an internet connection, i.e. “in the cloud”. Hence the computation is made neither on your local computer nor on a local server.

Cloud computing has primarily been used for data/file storage and offline computation, whereas the use of online computation is currently on the rise. It is particularly strong within the field of data science, which looks at large amounts of data to gain insight and retrieve relevant information for taking appropriate decisions. This is one of Combine’s specialty areas, supported by our open-source tool Sympathy for Data and its cloud companion Sympathy Cloud. https://www.sympathyfordata.com

#### Connected services within the automotive industry

An area where online cloud computation is especially on the rise is the automotive industry. Once one automotive manufacturer delivers functionality within a new, growing area, the race begins, and it has officially started. Some of the key players so far are:

#### How it works

One application of the cloud within the automotive industry is to keep track of the location of all connected vehicles at any time and assess whether a particular vehicle requires information for its current trajectory. This information could, for example, be of informative or safety critical character or crucial for the driver/vehicle to plan ahead.

The assessment is broken down into localization, trajectory estimation and translation of everything to the same timeframe. Depending on where you put the responsibility of the final assessment, you could either have the cloud make the complete assessment or have the necessary information be sent down to the intended vehicle for final assessment through onboard data processing.
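As a rough illustration of the assessment steps above, the sketch below extrapolates a vehicle report to a common point in time and checks whether the predicted path comes close to a known hazard. Everything here is an assumption made for illustration: the class and function names, the constant-velocity motion model, the flat x/y coordinates and the 30-second horizon are not taken from any production system.

```python
from dataclasses import dataclass
import math

@dataclass
class VehicleReport:
    t: float      # timestamp of the report [s]
    x: float      # position east [m]
    y: float      # position north [m]
    vx: float     # velocity east [m/s]
    vy: float     # velocity north [m/s]

@dataclass
class Hazard:
    x: float
    y: float
    radius: float  # warn if the predicted path comes within this distance [m]

def predict_position(report, t):
    """Constant-velocity extrapolation of a report to a common time t."""
    dt = t - report.t
    return report.x + report.vx * dt, report.y + report.vy * dt

def needs_warning(report, hazard, now, horizon=30.0, step=1.0):
    """Return True if the predicted path passes within hazard.radius during the horizon."""
    steps = int(horizon / step)
    for k in range(steps + 1):
        px, py = predict_position(report, now + k * step)
        if math.hypot(px - hazard.x, py - hazard.y) <= hazard.radius:
            return True
    return False

# A vehicle driving east at 20 m/s, with one hazard ahead and one behind it.
report = VehicleReport(t=0.0, x=0.0, y=0.0, vx=20.0, vy=0.0)
hazard_ahead = Hazard(x=400.0, y=0.0, radius=25.0)
hazard_behind = Hazard(x=-400.0, y=0.0, radius=25.0)
warn_ahead = needs_warning(report, hazard_ahead, now=2.0)
warn_behind = needs_warning(report, hazard_behind, now=2.0)
```

The same `needs_warning` check could run either in the cloud or onboard the vehicle, which is the design choice about where to place the final assessment.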

The main application for this technology so far has been informing the driver of potential safety threats along the road ahead, such as hazardous obstacles (broken-down vehicles or road work areas) or ambient information (road friction). But by introducing cloud-to-cloud communication, where for example one cloud can hold infrastructure information (such as traffic light information), the spectrum of information is broadened.

However, the main benefit of using the cloud would come from having it act as a “mother of all sensors”-sensor, i.e. by being able to receive data about your surroundings that is statistically substantiated. It would then be able to assist in reaching level 5 automation for autonomous vehicles (https://www.gigabitmagazine.com/ai/understanding-sae-automated-driving-levels-0-5-explained).

The main challenges in getting there are:

• Latency; as it is a real-time system, the latency of both communication and computation is crucial.
• Computational reliability; as always, the computational reliability is crucial, but within an online cloud computation framework the information sent to the vehicle needs to be fully reliable for the vehicle to act on it without hesitation.
• Having enough data; both having enough data from a vehicle to assess a situation and having data from a sufficient number of vehicles.

#### The future of online cloud computing

As internet accessibility is getting better and data transmission is getting cheaper, especially with 5G on the horizon, the future of online cloud computing looks promising. If more and more companies go towards data sharing, the range of possible applications looks even more promising.

Nevertheless, the main issue when collecting and sharing personal data (such as vehicle data or data from your mobile phone) in such quantity and quality will be privacy. Sharing and storing data continuously at every instant will open the door to misuses such as surveillance or tracking, no matter how anonymized the data may be. And thus, we are moving towards an “all-seeing eye” society.

Kickstarted by the need for portable electronics such as phones and laptops and fueled by the increasing demand in storage capacity, battery technology has seen unprecedented improvements over the past decade.  Nevertheless, even the most advanced battery types share a common trait – they degrade over time, decreasing the amount of potential energy (or capacity) they can store. After they drop to about 80 % of their nominal capacity – typically after several thousand cycles – they are considered at the end of their so-called remaining useful life (RUL) and usually discarded. But the end of RUL does not have to mean retirement. Batteries are prime candidates for second-life applications in systems featuring lower requirements, stationary applications or auxiliary units. But how can we determine how long the battery has left until absolute failure? After all, any investment in cycling applications requires a safe assessment of remaining capacity and projected degradation trajectory. Enter another promising field experiencing rapid growth – machine learning.

One can argue that the most crucial part of machine learning is acquiring well-annotated data. Even the most complex model will not be able to make sense of a dataset that is too small, missing key variables or of poor quality. Without good quality data, the model choice or parameter tuning is of little importance. The team behind “Data-driven prediction of battery cycle life before capacity degradation” (Severson et al., 2019) provides an extensive dataset monitoring battery degradation. In their paper the authors claim the dataset to be the largest publicly available dataset so far, containing close to 97,000 charge/discharge cycles of 124 commercial LiFePO4 battery cells.

During the charge-/discharge cycles, the team continuously monitored voltage, current, temperature and internal resistance of the batteries. External factors influencing battery degradation were limited during the data generation process by performing measurements in a temperature-controlled environmental chamber set to 30 °C.  The batteries were subjected to varied fast-charging conditions to facilitate different degradation rates while keeping identical discharge conditions.

In the feature engineering process, the authors found a single feature to have a linear correlation factor of -0.92 with the log of RUL. This feature was the log of the variance of the difference between discharge capacity curves, as a function of voltage, between the 100th and the 10th cycle. Thus, the engineered feature could be used by itself to achieve great prediction results of RUL after the 100th charge/discharge cycle.
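The feature described above can be sketched in a few lines of NumPy. The example assumes both discharge capacity curves have already been interpolated onto the same voltage grid, and it uses a base-10 logarithm, which is an assumption since the text only says "log". The curves are synthetic and made up for illustration; they are not taken from the dataset.

```python
import numpy as np

def log_variance_delta_q(q_late, q_early):
    """log10 of the variance of the difference between two discharge curves Q(V)."""
    delta_q = np.asarray(q_late) - np.asarray(q_early)
    return np.log10(np.var(delta_q))

# Synthetic discharge capacity curves on a shared voltage grid (made-up shapes).
voltage = np.linspace(3.5, 2.0, 1000)
q_cycle_10 = 1.10 / (1.0 + np.exp((voltage - 3.0) * 20.0))
q_cycle_100 = 1.08 / (1.0 + np.exp((voltage - 2.98) * 20.0))

# One scalar per cell; this is the single input to the regression model.
feature = log_variance_delta_q(q_cycle_100, q_cycle_10)
```

Changing the reference cycle only means passing a different early curve as the second argument.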

Using Elastic Nets, with default parameters, we obtained the following prediction results after the 100th cycle.

Note that we decided to use the first cycle as a reference when creating the feature instead of the 10th, as suggested in the paper. Next, the degradation of five (randomly selected) individual batteries is shown together with their predicted RUL. The exact RUL value is difficult to predict correctly, but the prediction may be of sufficient precision for simple classification tasks.

While the engineered feature used has an outstanding correlation with RUL it is nevertheless very restrictive. Using such a feature basically means performing 100 charge/discharge cycles in a controlled environment before a prediction is possible. In a commercial setting, such a setup will be too time-consuming, costly and thus impractical. Therefore, finding other features that are less restrictive but still offer good predictive performance is important. For example, we found that using the first cycle as a reference instead of the 10th is a suitable candidate for predicting RUL, increasing the commercial viability of the prediction method. The figure below visualizes the correlation coefficient between the feature and RUL for each cycle up to 200 cycles.

The figure tells us that the linear correlation is the largest around 100 cycles. It might still be possible to use the feature already after 40-50 cycles with a smaller reduction in performance. Keep in mind that linear correlation is most indicative of prediction performance for linear machine learning methods, while non-linear methods can find useful information for prediction even when the linear correlation is low.

Until now, we have only considered one feature for predicting RUL, but several more can be engineered from the charge/discharge cycle data. Introducing more features leads us to another problem, namely feature selection. There are regression methods that can report on the feature importance for prediction following their training. An example of such a method is the Random Forest Regressor (RFR), which is also a non-linear estimator. Supplied below is an example of the feature importances reported by an RFR after fitting.

Using the smallest subset with a combined total importance of more than 0.8, the top 6 features were selected and the following prediction chart was obtained on test data.
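The selection rule described above can be sketched as follows: rank the features by importance and keep the fewest whose importances sum to more than the threshold. The feature names and importance values are hypothetical. In practice the importances could come from a fitted scikit-learn `RandomForestRegressor` via its `feature_importances_` attribute.

```python
import numpy as np

def smallest_subset_by_importance(names, importances, threshold=0.8):
    """Return the fewest top-ranked features whose importances sum to more than threshold."""
    order = np.argsort(importances)[::-1]            # most important first
    cumulative = np.cumsum(np.asarray(importances)[order])
    # Number of features needed for the running total to exceed the threshold.
    count = int(np.searchsorted(cumulative, threshold, side="right")) + 1
    count = min(count, len(order))
    return [names[i] for i in order[:count]]

# Hypothetical feature names and importances (importances sum to 1).
names = ["f0", "f1", "f2", "f3", "f4"]
importances = [0.05, 0.40, 0.10, 0.30, 0.15]
selected = smallest_subset_by_importance(names, importances)
```

With these made-up values, three features are kept because the two largest importances only sum to 0.70.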

As can be seen, the predictions are the best when the RUL is smaller than 200, but between 250 and 1500 the mean prediction is close to the RUL at the cost of increasing variance. Only half of the battery cells live longer than 850 cycles, which decreases the amount of training data for larger RULs and will introduce a bias towards better predictions for smaller RULs.

The eager reader might be wondering about the elephant in the room by now – what about actual batteries? After all, they would be the focus of any real battery lifetime prediction efforts. The discussion above, however, has considered individual cells, tens or even hundreds of which are usually combined into battery packs to obtain the necessary voltage and current in commercial applications. Unfortunately, no public dataset of the same magnitude as the Stanford study is available. As a stopgap measure, we can construct a collection of virtual batteries by simply aggregating the cells to mimic existing products – e.g. a 72Ah LiFePO4 pack. This method is akin to bootstrapping, in this case choosing 68 cells with replacement from the dataset to obtain a collection of training and testing packs. It is also a poor approximation of reality, since it does not model any of the possible complex interactions within a pack. It thus serves more as a preview of possible future studies. We can then train a Random Forest Regressor on the whole lifetime until failure of the training selection and predict RUL of the testing collection. The figure below presents the resulting predictions in orange against the blue RUL lines of five selected packs, showing not only their striking similarity but also a high accuracy of our predictions ($$r^2$$ ≈ 0.99). It serves as a depiction of our main idea – by aggregating the cells together we effectively “smooth” over their individual impurities, removing uncommon outliers and thus enhancing the predictive capabilities of our model.
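The construction of virtual packs can be sketched with NumPy. The sketch draws 68 cells with replacement and aggregates their capacity curves. The text does not specify how the cells are aggregated, so summing the capacities is an assumption here, and the per-cell data is synthetic with a made-up degradation shape rather than the real dataset.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for the cell dataset: one capacity-vs-cycle curve per cell.
n_cells, n_cycles = 124, 500
cycle_life = rng.uniform(300, 2000, size=n_cells)
cycles = np.arange(n_cycles)
# Made-up degradation shape: capacity fades from 1.1 Ah, faster for short-lived cells.
cell_capacity = 1.1 - 0.22 * (cycles[None, :] / cycle_life[:, None]) ** 2

def make_virtual_pack(capacity, cells_per_pack, rng):
    """Draw cells with replacement and aggregate them into one pack-level curve."""
    chosen = rng.integers(0, capacity.shape[0], size=cells_per_pack)
    # Aggregation by summing cell capacities is an assumption of this sketch.
    return capacity[chosen].sum(axis=0)

# Ten virtual packs of 68 cells each, one row per pack.
packs = np.stack([make_virtual_pack(cell_capacity, 68, rng) for _ in range(10)])
```

Because each pack pools many cells, its curve varies less between packs than individual cell curves do, which is the smoothing effect mentioned above.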

Of course, the applicability of this approach needs to be tested extensively against real-world battery health data. Gathering and analyzing it will be the next exciting step on our journey and our contribution toward saving the world. After all, we wholeheartedly agree with Mr. Anderson, and humbly add – the future is electric or not at all.

AiTree Tech, together with Combine, is eager to help contribute to a better environment and a sustainable future. While many actors are focusing on today’s transfer of energy source from fossil fuels to batteries, we try to look ahead and are focusing on how to reuse batteries after their 1st and 2nd life.

ENTER THE NEXT LEVEL

AiTree Tech provides a data-driven machine learning solution for predicting the complete lifetime of lithium-ion batteries, from 1st life to end of life.
Let’s start the journey together!

Johan Stjernberg CEO@ AiTree Tech
+46(0)733 80 08 44

Erik Silfverberg CTO@ AiTree Tech
+46(0)730 87 40 20

### The Problem

The problem that arises is that a company is no longer capable of handling or analysing the amounts of data with its standard laptops and methods. The obvious limitations come from a hardware perspective: the data is often too large to be stored on a hard drive and most definitely cannot fit into the RAM, or the CPU is too slow and computation times easily grow to hours, if not days, for even the simpler tasks. The limitations of the software used are quite often overlooked. Throwing more computational power at the problem seems like the best solution, but optimizing the framework used to handle the data might be just as important, if not more, to increase performance.

#### Solving the Problem

So what do you do when you can no longer analyse the data yourself? Many companies have resorted to using cloud services. These services use a cluster of computers or servers, basically renting out hardware capabilities as the customers need them. This is a neat solution with a low threshold for individual companies, and it maximizes the usage of the hardware. It does not come without its drawbacks though. Data security is the first thing that comes to mind. Your confidential data needs to be uploaded to servers which are out of your control, and while the providers of cloud services are without a doubt very serious about safety, the security of the data is now out of your hands. Secondly, if you need serious hardware capabilities, such as a CUDA-compatible GPU for deep learning, the prices quickly increase and can grow larger than the value the service brings. Lastly, to access the services and work with your data, you need connectivity. Downtime is not uncommon, and during these periods you are not able to do your work. So what is the alternative? To ensure the security of your data, minimize costs, and maintain the possibility to work even if internet connectivity is down, the solution is to bring the Big Data capabilities in-house. Choosing the correct hardware is, of course, an important decision and a whole topic on its own, but picking the appropriate software and frameworks is equally important. There is no use in having bridged CUDA GPUs if you cannot use them, or 100 CPU cores if you cannot support multi-threaded computations.

#### Python Tools

Python, along with R, has for many years been a very popular choice for data scientists around the world. It has many different toolboxes which make working with data and producing interesting results easy, and it is not proprietary software. So if you want to use Python for your in-house server, what Big Data tools are at your disposal? The favoured tool, widely used nowadays, is Pandas. But as we will see later, this might be about to change.

#### Pandas

Pandas is described as an open-source, high-performing, easy-to-use data structures and analysis tool for Python¹. It provides ready-built functions for reading, writing, selecting, and analysing data in a compact and intuitive way.

#### Accelerator

Accelerator, an open-source Python library, is a data processing framework that provides fast data access, parallel execution and automatic organization of source code, input data and results². This tool focuses on performance rather than simplicity and user-friendliness. You do not need to be a programming wizard to use it, but the ready-built functions are more sparse.

#### Comparison of Pandas vs Accelerator

So let us put the well-proven, easy-to-use tool of Pandas head to head with the performance-driven Accelerator. The first thing you need to do in any project with data is to read it from the disk. Pandas supports reading of several different formats, but CSV and Excel files are the most common. Accelerator has its own dataset class. We create some dummy data containing two variables and n rows. We use the built-in function to read a CSV file in Pandas, and the Dataset class in Accelerator. Once the data is loaded, we iterate over it once.

Figure 1: Reading data from disk and iterating over all rows

We can see that as the size of the dataset grows, Pandas starts to take a lot more time compared to Accelerator. Note the logarithmic scales on both axes. Reading from disk is notoriously slow, so let us ignore the reading time for Pandas and look at just the time to iterate over the rows. Accelerator is made for super fast reading and writing to disk, and does not load all the data into memory in the same way as Pandas. Therefore, we will keep the time reported for Accelerator as the time it takes to both read the data and iterate over the rows.
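For readers who want to reproduce the Pandas side of this test, a minimal sketch of timing the read and the row iteration separately could look as follows. The column names, row count and in-memory CSV buffer are assumptions made to keep the example self-contained; the original test read from disk. The Accelerator side is left out because its API is not described in the text.

```python
import io
import time
import numpy as np
import pandas as pd

n_rows = 100_000
rng = np.random.default_rng(0)
# Dummy data with two variables, written to an in-memory CSV for the sketch.
csv_text = pd.DataFrame({"a": rng.random(n_rows), "b": rng.random(n_rows)}).to_csv(index=False)

# Time the parsing step on its own.
start = time.perf_counter()
df = pd.read_csv(io.StringIO(csv_text))
read_seconds = time.perf_counter() - start

# Time an explicit Python-level pass over every row.
start = time.perf_counter()
row_count = 0
for row in df.itertuples(index=False):
    row_count += 1
iterate_seconds = time.perf_counter() - start
```

Increasing `n_rows` by factors of ten gives the kind of scaling curve discussed above, within the limits of available memory.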

¹ https://pandas.pydata.org/
² https://www.ebayinc.com/stories/blogs/tech/announcing-the-accelerator-processing-1-000-000-000-lines-per-second-on-a-single-computer/

Figure 2: Pandas: Iterate over all rows, Accelerator: Read data and iterate over all rows

We see that, for Pandas, the time it takes to only iterate over the rows is less than the time to read plus iterate, but Accelerator is still able to both read the data and iterate over all rows faster than Pandas as the data size grows. Let us have a look at some of the most commonly used statistics which have built-in functions in Pandas, starting with the summation of all rows. Here we calculate the sum of all rows for the two variables in the dataset. For Pandas, we look at both the total time for reading the data and calculating the sum, and the time for calculating the sum alone. For Accelerator, we only report the time for reading the data plus calculating the sum. This gives an idea of how the built-in functions of Pandas perform alone, and how they would compare in a more real-life situation where the data has to be loaded as well.

(a) Pandas: Read data and sum of two variables, Accelerator: Read data and sum two variables (b) Pandas: Sum of two variables, Accelerator: Read data and sum two variables

Figure 3: Summation of variables

In this case, Pandas is actually faster (even if just slightly) even with one billion rows of data when not taking into account reading the data. If the data has to be read from disk first, Accelerator is still much faster for larger datasets. Let’s look at two other built-in functions of Pandas. First, selecting or filtering of rows: in this test we select all rows where the first variable is larger than a set value and discard all other rows. In the second test, we multiply the two existing variables and create a third variable containing the result. Again, for Pandas, we look at both the time it takes to only do the relevant calculation and the time for reading plus performing the relevant task, while for Accelerator we only look at the time for both reading and performing the relevant calculations.
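The three built-in Pandas operations used in these tests (summation, filtering and multiplication) can be sketched as below. The column names, the data and the 0.5 cut-off are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({"a": rng.random(1000), "b": rng.random(1000)})

column_sums = df[["a", "b"]].sum()   # sum over all rows for each variable
filtered = df[df["a"] > 0.5]         # keep rows where the first variable exceeds a set value
df["c"] = df["a"] * df["b"]          # third variable holding the product of the two
```

All three are vectorised, so they avoid the Python-level row loop that makes manual iteration slow.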

(a) Pandas: Read data and filter data, Accelerator: Read data and filter data (b) Pandas: Filter data, Accelerator: Read data and filter data

Figure 4: Filtering of data

In these tests, Pandas is faster as well when the data has been previously loaded. Here we are able to see the high-performing aspect of Pandas. The built-in functions Pandas provides are actually fast and efficient, but in a full-scale data science project, it might still not be enough. The first issue which might make Pandas unsuitable for a project with very large amounts of data is the data reading. By performing many analyses in the same script and reusing data, it is possible to minimize the number of times the data needs to be read from disk. But in reality, scripts will have to be re-run many times because of bug fixes, changes in plots etc. For readability and version control it is also undesirable to have one large script that performs multiple analyses, and as such it will be nearly impossible to avoid reading the data from disk often. Secondly, while the built-in functions of Pandas can do many things, they are still limited. Out in the real world, the data is rarely structured in a perfect and intuitive way, and statistics such as means and sums cannot always be computed in the traditional way. The result is a custom-made solution that requires iteration over all the rows, which, as we saw, is significantly slower in Pandas.

#### Final Thoughts

In these tests, we have seen that the built-in functions of Pandas are in fact fast and efficient, but reading the data from disk and manual iteration over the rows quickly become quite slow in comparison to Accelerator, which is able to do all the different things in a more consistent time. Pandas has many strengths and will definitely remain one of the top choices when working with data in general, but for projects with datasets of more than some tens of millions of rows, Accelerator will provide faster development and calculation speeds. In projects with smaller amounts of data, or pilot projects where not the entire available dataset is used, the easy-to-use Pandas framework is still preferred in general, but for the larger scale Big Data projects, Accelerator will be the obvious choice.

Who are you?
My name is Charlie Sjödin. After working for two years at a large consultancy firm I felt that something was missing. When I started looking for new opportunities, I came across Combine, where Thomas (former group manager for Control Systems Solutions Gothenburg) was lurking in the shadows. I started working at Combine in April 2017 and have since worked on an assignment in connected safety at Zenuity.

You have just recently arrived back from a leave of absence. How have the past months been?
Yes, that is correct. I arrived back in Sweden in the beginning of April after a long and well-needed leave of absence during which I travelled in Southeast Asia. During the 5 months of travelling through Thailand, Laos, Indonesia, Vietnam and Cambodia I saw beautiful nature and crazy traffic, and tasted a lot of amazing food. Everything from excellent curries (and cricket snacks) in Thailand and Cambodia and sambal-infused dishes in Indonesia to crispy spring rolls in Vietnam.

The plan for the trip from the beginning was to have time to experience the countries as they are: their rich culture, people and nature. Unfortunately, reality caught up with me quickly, as the infrastructure was poor and travelling by public transport took much more time than anticipated. However, I was still able to visit all the countries, get to know the people and indulge in their culture, even though the trip became a bit more touristy than planned.

Some of the most memorable experiences from the trip were the karst limestone cliffs in Bai Tu Long Bay (Vietnam) and the Bayon temple in the Angkor area (Cambodia). Just search on the interwebs and be wowed.

With so many new experiences and time to think, is there a reflection that you find especially interesting that you want to share?
Absolutely. One reflection I made from the trip is the responsibility we engineers have for making conscious moral choices. Maybe this does not apply on a daily basis, but it certainly plays a role in our overall contributions to the development of the world. Just think about the importance of the complete lifecycle of a product. A main issue I saw during my trip was the negative impact on nature of neglecting the lifecycle of different products, where plastic waste is undoubtedly one of the more extreme examples. The accident of the Boeing 737 plane crashing in Indonesia also illustrates, in black and white, the important role of an engineer in delivering safe and well-tested software.

So, you are back with new experiences. Where do you see yourself taking the next step in your professional life?
As a newly enlightened environmentalist, I would like to contribute to the sustainability of this world. Whether this will be within the automotive industry or another industry, is for now shrouded in mystery as we are looking for a new assignment.

The history of Sympathy for Data
The founders behind the Platform were two employees at Volvo Cars, Stefan Larsson and Krister Johansson.

In August 2009 Stefan Larsson started writing the first prototype. The prototype was presented to his colleague Krister Johansson, a technical expert in statistics, who also worked at Volvo Cars. Krister immediately realized the potential of the concept and started to assist in finishing the prototype.

By December 2009 they came to the insight that it wouldn’t be possible to finish the software while still having a daytime job. They decided to ask for permission to continue the work during work hours under the condition that the ownership of the software was put in a non-profit organization and that it was published using an open source license. Krister and Stefan continued programming and started preparing for the founding of the non-profit organization.

In May 2010 Krister and Stefan set up a meeting with Combine. Combine became interested in the concept and in December 2010 the non-profit organization “System Engineering Software Society” (SysESS) was founded.

Since then Combine has led the development of Sympathy, with Erik der Hagopian as Product Owner, with the purpose of providing an open source solution in the domain of Data Science.

The intention in the long run
Combine’s intention is to continue licensing Sympathy for Data as an open-source tool. However, add-on products such as cloud services, cluster support, etc. will not be licensed under an open-source license. Nor will we try to build another BI tool; instead, we will continue to develop the kind of functionality that makes Sympathy great.

Future functionality
We see a need to develop functionality such as streaming support, cluster support, cloud services, and the ability to handle huge amounts of data using a better and smarter backend, as well as a need to focus more on user experience.

The next chapter
We now start the next chapter for Sympathy, where we aim to increase the development pace of the functionality mentioned above and to focus on Machine Learning, Deep Learning, Image Processing, etc.

Enter the Next Level!
Erik Silfverberg
CEO, Combine Control Systems AB

Shaping the future
The environmental impact of electric cars, and more specifically of the batteries used in those cars, is constantly under scrutiny. One possible solution is the reuse of vehicle batteries, often called second-life. The concept enables repurposing batteries that no longer meet the requirements of their intended vehicle application but still fulfil those of other applications.

Circular economy
The battery of an electric vehicle is today considered to have reached its “end-of-life” when its capacity has decreased by approximately 20-30% compared to the initial value, which means that 70-80% of the capacity remains.

The problem today is to determine whether the battery can be reused or whether it should go to recycling.

Once we can determine the “end-of-life”, the batteries could easily go into the concept of circular economy, minimizing the use of our precious resources. By utilizing circular economy we can reduce cost, for example by re-selling batteries in an energy storage system or recycling them for reuse in a battery factory.

AiTree Tech will provide a data driven machine learning solution. The aim is to predict the (complete) lifetime of lithium-ion batteries from first life to end of life.

The solution will enable customers to:

• understand the scenario of re-manufacturing
• analyse data and optimise the battery size depending on its usage
• get a better understanding of the lifetime (1st, 2nd life, etc.)
• get ID tracking for reporting to authorities

The company will be up and running during 2019.
Let’s contribute to save the world!

Erik Silfverberg
CEO, Combine Control Systems AB

The engineering needed to control a missile spans many separate fields: control theory, aerodynamics, propulsion, materials science, etc. Only control theory is discussed here, and only a subset of it.

A typical architecture for missile guidance and control can be described as follows:

Figure 1. Missile guidance and control.

The target state is measured by sensor(s). This measurement together with the missile state is fed into the missile control system. This system can be split into two parts, guidance and autopilot. The guidance portion determines what the optimal maneuver is for the missile. The autopilot performs that maneuver by controlling the missile, typically with control surfaces such as rudders. In this discussion the guidance is considered, i.e. how should the missile maneuver in an optimal way. The sensors, autopilot and missile dynamics are assumed to be ideal; without latency, noise or other issues.

# Guidance principles and strategies

Even though there exist many different strategies for missile guidance, they can be divided into two major ones to consider when designing modern missiles: Proportional Navigation and Command-to-Line-Of-Sight.

Proportional Navigation (PN) is a guidance law that exploits the fact that two vehicles that have constant Line-of-Sight (LOS) to each other are on a collision course. In other words, if the LOS to the target does not rotate seen from the missile, it is on an intercept course. This has been known in shipping for hundreds of years and is used to avoid collisions between ships.

PN tries to achieve a constant LOS angle by accelerating the missile towards the rotation of the LOS, and thereby eliminating the rotation. Basically, the assumption is that the best guess on future target trajectory is that it will continue its current course.

The guidance law in its simplest form can be described as:

$$a_m =N\dot{\lambda} |V_c|$$

$$a_m$$: Commanded acceleration perpendicular to LOS
$$\dot{\lambda}$$: rotation of LOS
$$|V_c|$$: closing speed of missile relative to target

In a missile with an onboard target tracking system, such as a camera or an IR sensor, $$\dot{\lambda}$$ is easily available. $$|V_c|$$ can in some cases be measured but is often approximated. Since the missile should have a much higher speed than the target, a good approximation can be the missile’s own speed.

N is a design parameter, typically in the range 3-6 and always >2. A high value guides the missile on to an intercept course faster but requires higher accelerations and makes the missile more sensitive to noise. In practice it is common to have a varying value of N, a low value at launch and higher when closer to the target.

Note that the range to the target is not needed. This is an important property that has contributed to the widespread use of this principle.
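As a minimal sketch of the guidance law above, the function below evaluates the commanded acceleration from the LOS rate and the closing speed. The function name, the guard on N and the example numbers are illustrative assumptions, not values from a real missile.

```python
def pn_acceleration(nav_gain: float, los_rate: float, closing_speed: float) -> float:
    """Commanded acceleration perpendicular to the LOS: a_m = N * lambda_dot * |V_c|.

    nav_gain      -- design parameter N (dimensionless)
    los_rate      -- rotation of the LOS in rad/s; the sign gives the turn direction
    closing_speed -- closing speed of missile relative to target in m/s
    """
    if nav_gain <= 2:
        # The text states that N is always > 2; rejecting other values is a
        # choice made for this sketch.
        raise ValueError("N should be greater than 2")
    return nav_gain * los_rate * abs(closing_speed)


# Hypothetical numbers: LOS rotating at 0.02 rad/s, closing speed 600 m/s, N = 4.
print(pn_acceleration(4.0, 0.02, 600.0))
```

Note that the function takes no range argument, which mirrors the property pointed out above.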

The above guidance law cannot guarantee interception against a maneuvering target. That requires an extension of the guidance law in which the target acceleration is considered, called augmented proportional navigation (APN):

$$a_m=N\dot{\lambda}|V_c|+\frac{Na_{\bot}}{2}$$
$$a_{\bot}$$: target acceleration normal to the LOS

The acceleration is seldom possible to measure by a sensor and must be estimated, which can be difficult and create high noise. Note also that the required acceleration by the missile is proportional to N. This means that choosing a more responsive missile, i.e. high N, requires more acceleration from the missile.
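The APN law can be sketched in the same way. The function name and the numbers are hypothetical; in practice the target acceleration would come from an estimator, which is not modelled here.

```python
def apn_acceleration(nav_gain: float, los_rate: float, closing_speed: float,
                     target_accel_normal: float) -> float:
    """Augmented PN: a_m = N * lambda_dot * |V_c| + N * a_perp / 2."""
    pn_term = nav_gain * los_rate * abs(closing_speed)
    # Extra term for the (estimated) target acceleration normal to the LOS.
    augmentation = nav_gain * target_accel_normal / 2.0
    return pn_term + augmentation


# Hypothetical numbers: same geometry for two values of N, with the target
# pulling 10 m/s^2 normal to the LOS.
for n in (3.0, 6.0):
    print(n, apn_acceleration(n, 0.02, 600.0, 10.0))
```

Both terms scale with N, so doubling N doubles the commanded acceleration for the same geometry.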

## Command-to-Line-Of-Sight

Command-to-Line-Of-Sight (CLOS) works by keeping the missile on the line seen from the sight towards the target. If the missile is closing on the target it will eventually intercept its path regardless of the target range.

Figure 3. Command-to-Line-of-sight. The missile is kept on the line between the sight and the target.

The resulting flight path is one where the missile leads the target more and more the closer it is to intercept. This can be seen in Figure 3 as an increasing angle between the LOS and missile velocity vector.

At launch the missile flies straight at the target, and near intercept the missile matches the target’s angular velocity. This is intuitively a good strategy: at launch it is hard to predict where the target is heading and how it will maneuver in the future, so a good guess is its current direction. Closer to intercept, the target has little time left to maneuver and the intercept point can be predicted, so the full lead angle can be used.

Note that this guidance principle doesn’t require the range to the target either. All that is needed is some way to measure the missile’s position relative to the LOS as seen from the sight. Also, the missile is guaranteed to intercept the target if kept on the LOS, regardless of the target maneuver. In theory, the missile does not need to maneuver more than the target, unlike with PN and APN.
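The quantity a CLOS system works with can be sketched as the angle between the sight-to-missile line and the sight-to-target line. In the sketch below the positions are only used to generate the two bearings; a real sight would measure the angles directly. All names and coordinates are hypothetical, and the controller that drives the error to zero is left out.

```python
import math


def bearing(from_xy, to_xy):
    """Bearing in radians of the line from one 2D point to another."""
    return math.atan2(to_xy[1] - from_xy[1], to_xy[0] - from_xy[0])


def clos_angular_error(sight, missile, target):
    """Signed angle from the sight->missile line to the sight->target line.

    The result is wrapped to [-pi, pi). Zero means the missile is on the LOS.
    """
    error = bearing(sight, target) - bearing(sight, missile)
    return (error + math.pi) % (2.0 * math.pi) - math.pi


sight = (0.0, 0.0)
target = (1000.0, 1000.0)
print(clos_angular_error(sight, (300.0, 300.0), target))  # missile on the LOS
print(clos_angular_error(sight, (500.0, 0.0), target))    # missile off the LOS
```

Only two bearings enter the calculation, which is why no range measurement is needed.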

## Other guidance principles

There exist other principles than PN and CLOS. Pure Pursuit is a principle that has been used. Pursuit works by just pointing the missile velocity vector straight at the target. If the target is not stationary, or very close to stationary, then this will result in a missile trajectory that requires very high, in theory infinite, accelerations close to the target. The principle is chosen where simplicity is more important than performance, i.e. against targets with negligible movements.
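Why the accelerations grow near the target can be illustrated with a small calculation. When the missile velocity points straight at the target, only the target’s velocity component across the LOS rotates the LOS, and the missile must turn at that same rate. The sketch below assumes the crossing angle stays fixed as the range shrinks, which a real pursuit trajectory would not do, and all numbers are hypothetical.

```python
import math


def pursuit_lateral_acceleration(missile_speed, target_speed, crossing_angle, target_range):
    """Instantaneous lateral acceleration needed to keep pointing at the target.

    crossing_angle is the angle in radians between the target velocity and the LOS.
    """
    # LOS rate caused by the target's velocity component across the LOS.
    los_rate = target_speed * math.sin(crossing_angle) / target_range
    # Turning a velocity vector of constant magnitude at that rate.
    return missile_speed * los_rate


for target_range in (1000.0, 100.0, 10.0, 1.0):
    accel = pursuit_lateral_acceleration(300.0, 100.0, math.radians(90.0), target_range)
    print(f"range {target_range:7.1f} m -> {accel:10.1f} m/s^2")
```

The demand scales with 1/range, so for a crossing target it grows without bound as the range goes to zero.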

## Simulation

The properties of the principles can be examined by some simulation examples. In the simulations the missile has constant speed which is unrealistic, but it helps to clearly show the properties of the guidance principles. The missile is initialized with a velocity pointing straight at the target.

In the first scenario the target is travelling with a constant velocity, from right to left. CLOS and PN result in the following missile trajectories.

In Figure 4 it is clear that at launch PN accelerates towards a straight intercept course. In comparison, CLOS does not accelerate as much at launch, but the trajectory requires more acceleration closer to intercept with the target. For a target with a constant velocity, PN generally travels a shorter distance.
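A rough way to reproduce the PN part of this scenario is a small 2D simulation with simple Euler integration. This is a sketch and not the code behind the figures: the speeds, positions, time step and N are hypothetical, and the command is applied perpendicular to the missile velocity so that the speed stays constant.

```python
import math


def simulate_pn(nav_gain=4.0, dt=0.001, t_max=60.0):
    """Return (smallest sampled range in m, elapsed time in s) for a PN engagement."""
    # Hypothetical scenario values, not taken from the article's figures.
    missile_speed = 300.0            # m/s, held constant
    mx, my = 0.0, 0.0                # missile position
    tx, ty = 3000.0, 3000.0          # target position
    tvx, tvy = -100.0, 0.0           # target velocity, right to left

    # The missile starts with its velocity pointing straight at the target.
    heading = math.atan2(ty - my, tx - mx)
    min_range = math.hypot(tx - mx, ty - my)
    t = 0.0

    while t < t_max:
        mvx = missile_speed * math.cos(heading)
        mvy = missile_speed * math.sin(heading)
        rx, ry = tx - mx, ty - my            # LOS vector, missile -> target
        rvx, rvy = tvx - mvx, tvy - mvy      # relative velocity
        rng = math.hypot(rx, ry)
        min_range = min(min_range, rng)
        if rng < 1e-9:
            break

        closing_speed = -(rx * rvx + ry * rvy) / rng
        if closing_speed <= 0.0:
            break                            # range is opening: closest approach passed

        los_rate = (rx * rvy - ry * rvx) / (rng * rng)
        accel = nav_gain * los_rate * closing_speed   # a_m = N * lambda_dot * |V_c|

        # Constant speed: the command only rotates the velocity vector.
        heading += accel / missile_speed * dt
        mx += mvx * dt
        my += mvy * dt
        tx += tvx * dt
        ty += tvy * dt
        t += dt

    return min_range, t


miss, flight_time = simulate_pn()
print(f"closest sampled range: {miss:.2f} m after {flight_time:.2f} s")
```

Logging `heading` over time would show the behaviour described above: most of the turning happens shortly after launch, after which the missile flies an almost straight intercept course.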

In the next simulation the target starts with a constant velocity (right to left), but will after some time do a 90° maneuver, and then continue with a constant velocity.

At launch and until the target maneuver (#1 – #2) the missiles behave as in the previous example. PN maneuvers towards an intercept course, assuming the target will continue with a constant velocity. This results in a greater course change for PN during and after the target maneuver (#2 – #3), since the new predicted intercept course has moved. CLOS, on the other hand, did not fully commit to the intercept course before the target maneuver and requires smaller course corrections after it.

Note also that PN is not guaranteed to intercept the target during its maneuver, but CLOS is (in theory).

Both CLOS and extended PN are useful as guidance principles. Which one is optimal is, as always, a matter of how “optimal” is defined. CLOS is in practice only used for shorter ranges, since the target must be seen by the sight at all times.
Missiles using PN typically have the angle-measuring sensor in the missile, which gives increasing accuracy and precision when closing on the target. CLOS has the sensor in the sight, which requires a better sensor to achieve acceptable performance because of the longer range to the target. However, a sensor in the missile needs to be small, cheap, and disposable, whereas a sensor in the sight can be designed with fewer compromises.

In practice, the guidance principle is chosen based on a number of considerations, such as: What is the kinematic performance of the missile? What are the intended targets? What kind of sensors are available?

# Path optimization in nature

In nature there exist several techniques used by predators to pursue their prey. The optimal strategies can vary depending on the goal. As for missile guidance, the chance to catch a prey can be optimized, but there can also be other optimization variables. Animals have evolved to detect motion, so predators can try to minimize their movement against the perceived background to limit the reaction time available to the prey before it is caught (Zamani & Amador Kane, 2014; Mizutani, Chahl, & Srinivasan, 2003).

These techniques that have been shown to be used in nature are very similar to modern missile guidance laws[1].

When the background, e.g. trees and bushes, is sufficiently close, minimizing movement against that background is accomplished by staying on the line between a landmark and the prey. This is of course very similar to CLOS, where the sight is exactly behind the missile as seen from the target.

When the background is far away, as the sky is for birds attacking from above, the strategy instead becomes “Parallel Navigation” where the line between the predator and prey has a constant bearing.

The same strategy has also been seen with bats, which keep a constant bearing towards their prey. Since bats hunt at night, this is not to avoid detection but rather because it is an efficient strategy, and quite close to optimal for catching erratically moving insects (Ghose, Horiuchi, Krishnaprasad, & Moss, 2006). Bats and their interaction with prey are interesting in many respects. The echolocating sonar they use has made some of their prey evolve countermeasures against it, emitting sound to “jam” the sonar. This has in turn caused the bats to evolve a more complex sonar to counteract the jamming. Compare this to the military techniques of ECM and ECCM.

# Conclusion

Choosing the strategy for guiding an object to intercept another object is an interesting engineering problem. A theoretical analysis is useful to show the practical application. Choosing the “optimal” principle is only possible if there are stated goals and requirements, as well as knowing prerequisites and limitations.

Choosing PN as a guidance principle can seem optimal under some specific conditions, such as a target traveling with constant velocity. When the target maneuvers, CLOS seems to be a better choice. However, PN can be modified to Augmented PN, which may give better performance.

The analysis can show the basic properties of the principles, but the final choice needs to consider all aspects, including such things as development cost and time. But this is all part of engineering!

# References

Armstrong, R. E., Drapeau, M. D., Loeb, C. A., & Valdes, J. J. (2010). Bio-inspired Innovation and National Security.

Ghose, K., Horiuchi, T. K., Krishnaprasad, P. S., & Moss, C. F. (2006). Echolocating Bats Use a Nearly Time-Optimal Strategy to Intercept Prey. PLOS Biology.

Mizutani, A., Chahl, J. S., & Srinivasan, M. V. (2003). Motion camouflage in dragonflies. Nature.

Zamani, M., & Amador Kane, S. (2014). Falcons pursue prey using visual motion cues: new perspectives from animal-borne cameras. Journal of Experimental Biology.

Zarchan, P. (2013). Tactical and Strategic Missile Guidance.

[1] https://www.newscientist.com/article/dn3870-dragonfly-trick-makes-missiles-harder-to-dodge/

March came by, and brought about another PyCon event, this time in Bratislava. The range of topics presented was wide, as usual, spanning things like robotics, machine learning, operations, but also some of the more social aspects of software engineering, software projects in government, or even an entire track focused on education. Of course, we did not want to miss it and are happy to share some of the highlights here.

The opening talk of the conference was about Anvil, a platform making it possible to build interactive web applications entirely with Python, without having to deal with concepts like CSS, HTML, JavaScript, or SQL. It seems like a really neat tool, especially when building an application with limited resources. On the other hand, it is not an open platform, which would make lock-in a concern.

Anton Caceres shared some of his insights on architectures based around micro-services, which has been a very popular trend this decade. An important point was that even when using micro-services, it is beneficial to share the same base stack among all of the services, such as the language, frameworks, discovery mechanisms, failover, etc. In that case they only need to be maintained once, rather than for each specific flavor separately. He also presented a number of common patterns, such as sidecar containers, ambassador, or a pattern which combines a service registry with a side-car to keep all services informed about each other.

The first day was wrapped up with a talk by Miroslav Šedivý, who went on a deep dive into tzdata, the time zone database, sometimes referred to as the Olson database, which aims to be a complete compilation of all the information about the world’s time zones since 1970. Among other things we learned that Czechoslovakia was the only country which had, in addition to the standard time, not just summer time but also a third, winter time one year, and that both the Czech Republic and Slovakia have inherited the law which makes it possible to declare a winter time again. The key takeaway from this talk was that whenever you are dealing with time zones, it is of the utmost importance to use a library that is based on tzdata, such as pytz, since dealing with the ever-changing definitions of all the world’s time zones on your own is simply not feasible.
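To illustrate the takeaway, here is a minimal sketch that lets a tzdata-backed library work out the offsets instead of hard-coding them. It uses `zoneinfo` from the Python standard library (Python 3.9+) rather than pytz, which is a substitution on our part and not something shown in the talk; the dates are arbitrary. On systems without a time zone database installed, `zoneinfo` needs the `tzdata` package from PyPI.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Time zone rules are looked up in tzdata by name.
bratislava = ZoneInfo("Europe/Bratislava")

# The same UTC wall-clock time on an arbitrary winter day and summer day.
winter = datetime(2019, 1, 15, 12, 0, tzinfo=timezone.utc).astimezone(bratislava)
summer = datetime(2019, 7, 15, 12, 0, tzinfo=timezone.utc).astimezone(bratislava)

print(winter.isoformat(), winter.utcoffset())
print(summer.isoformat(), summer.utcoffset())
```

The two offsets differ because the summer time rules come from the database, not from the application code.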

On Saturday, a keynote talk was given by Honza Král, a former core developer of Django, in which he shared his insight on what skills are necessary in order to be a good software engineer. It is very common for people in technical fields, like data science, software engineering, or information security, to think technical skills are the key to success. This is also reinforced by the usual framing of “soft” vs. “hard” skills, which makes it easy for us to downplay the importance of the former. After all, “soft” implies “fuzzy” and “non-exact”, and that is antithetical to the perceived exact nature of the field of software development.

However, we can implement the most brilliant piece of software, and it will not be worth much if we cannot explain that fact to other people in a polite, efficient way, and collaborate with each other. That is why it has been suggested to change the labels we apply to the different skills to, for example, technical and professional skills.

Next up, Ján Suchal and Gabriel Lachmann gave a talk that was of particular interest to the Slovak audience members. The topic was IT projects in the Slovak government. For decades, the modus operandi was that the majority of government IT projects were defined in such a way that there was exactly one supplier who could fulfill all the requirements, usually one with ties to people sitting in the government organization making the order. As a result, the typical project was way overpriced, delayed ad infinitum, and would rarely produce any usable result.

That is why several years ago, a group of professionals, who were tired of this, founded an NGO called Slovensko.Digital. They are lobbying to open up the processes, pushing for open access to data and government platforms, and highlighting any shady practices going on within the world of government IT. Ján and Gabriel presented their vision, some of their recent successes, and how members of the public can get involved. While the current situation is still far from perfect, things have improved somewhat over the past years, and there is yet hope for the Slovak government.

On Sunday, one of the speakers could not make it to the conference, so in order to fill the hole in the schedule, the organizers played back a recording of Kenneth Reitz’s talk from PyCon US 2018 about Pipenv. This was a very useful introduction for those of us in the audience who never took the time to look into Pipenv. This tool automates the tasks of keeping a list of direct dependencies, a list of all pinned transitive dependencies, and an up-to-date Python virtualenv. It is really nice how adding a new dependency, while maintaining all of the above, only takes a single short command. Not to mention that it also includes other bells and whistles, such as sanity checks that all direct dependencies are reflected in the pins to prevent deployments using inconsistent state, or automatic checks of dependencies against known security vulnerabilities.

As one of the last talks of the conference, Ingrid Budau gave us an introduction to pandas, a popular library often used in data science, and in machine learning to manipulate large data sets. She walked us through the basics of importing a dataset, the data types that pandas recognizes, and how to work with variants. Then Ingrid moved on to show us how pandas can be used to detect malformed input data by looking at rows with shifted values, how to deal with that, or how to fill in missing data.