<h1 id="exploring-a-driver-concept">Exploring a driver concept</h1>
<p>AndWass C++, 2019-03-17</p>
<p>As I have mentioned before, it is common for embedded drivers to be
tightly coupled with the current hardware implementation. For instance,
a driver for an I2C RTC circuit such as <a href="https://www.nxp.com/products/analog/interfaces/ic-bus/ic-real-time-clocks-rtc/tiny-real-time-clock-calendar-with-alarm-function-battery-switch-over-time-stamp-input-and-ic-bus:PCF85263A">NXP PCF85263</a> together with an Atmel microcontroller will most likely use Atmel’s
framework (ASF for instance) directly to handle I2C communication. Now if
the microcontroller is switched to some other brand, you will have to change the
RTC driver to match the I2C driver supplied by the new brand of microcontrollers.</p>
<p>I have made these exact mistakes myself, and sadly I will probably continue to make
these mistakes in the future as well. Now I don’t like to make excuses but I can’t
help but feel that C, which is what I have done most embedded software with, isn’t
helping.</p>
<h2 id="better-drivers">Better drivers</h2>
<p>The tight coupling is a problem. When I develop solutions to other problems, and when
thinking of system architecture, I often
think along the lines of <em>“this system shouldn’t care how that system operates, it should only care
about the inputs and outputs”</em>. The same should go for drivers! If I have an I2C RTC
circuit, the driver for the RTC circuit shouldn’t care how the I2C communication is done;
it could be manually bitbanged, or use efficient DMA access, or anything else really. All
it should care about is that it can communicate with the circuit via an I2C bus.</p>
<p>Good drivers should also be composable. It is common for newer microcontrollers to have some
sort of RTC built in already. However, it is also common for this RTC to require more power
to continue its timekeeping when the rest of the system is off. External RTCs are very power
efficient, but require some sort of bus communication to use. I want to see a future
where the drivers for these two RTCs can easily be combined to create a synchronized RTC.
Reading time would <em>usually</em> fetch time from the internal RTC, and thus be fast, but every
now and then the time would be fetched and synchronized with the external RTC.</p>
<p>Another example would be to be able to combine two (or more) storage devices
to create a storage with the combined capacity of all its
child devices, and <em>where the underlying storage technology doesn’t matter</em>.
This should then be usable in the FAT32 driver and so on.</p>
<h2 id="polymorphism">Polymorphism</h2>
<p>A common approach in C is to have a <code class="highlighter-rouge">struct</code> with function pointers. The
<code class="highlighter-rouge">struct</code> provides a unified interface to interact with the driver. The
drivers in the Linux kernel are written using this technique, and Atmel’s ASF
also uses this approach. However this is pretty much a hand-coded version of
a C++ abstract class. In C you would have</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">i2c_chip_interface</span>
<span class="p">{</span>
<span class="kt">void</span> <span class="o">*</span><span class="n">interface_data</span><span class="p">;</span>
<span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">init</span><span class="p">)(</span><span class="kt">void</span><span class="o">*</span><span class="p">);</span> <span class="c1">// interface_data sent as argument</span>
<span class="kt">void</span> <span class="p">(</span><span class="o">*</span><span class="n">transmit</span><span class="p">)(</span><span class="kt">void</span><span class="o">*</span><span class="p">,</span> <span class="kt">uint8_t</span><span class="o">*</span><span class="p">,</span> <span class="kt">size_t</span><span class="p">);</span>
<span class="p">};</span>
<span class="c1">// Create instances</span>
<span class="k">struct</span> <span class="n">i2c_chip_interface</span> <span class="n">make_atmel_i2c</span><span class="p">(</span><span class="kt">void</span><span class="p">);</span>
<span class="k">struct</span> <span class="n">i2c_chip_interface</span> <span class="n">make_st_i2c</span><span class="p">(</span><span class="kt">void</span><span class="p">);</span>
<span class="c1">// Use instances</span>
<span class="kt">void</span> <span class="n">rtc_read</span><span class="p">(</span><span class="k">struct</span> <span class="n">i2c_chip_interface</span><span class="o">*</span> <span class="n">iface</span><span class="p">);</span>
</code></pre></div></div>
<p>and in C++ the equivalent would be</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">i2c_chip_interface</span>
<span class="p">{</span>
<span class="k">virtual</span> <span class="o">~</span><span class="n">i2c_chip_interface</span><span class="p">()</span> <span class="o">=</span> <span class="k">default</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">init</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">transmit</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="n">span</span><span class="o"><</span><span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span><span class="o">></span> <span class="n">to_write</span><span class="p">)</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">};</span>
<span class="c1">// Implement interface</span>
<span class="k">struct</span> <span class="n">atmel_i2c</span><span class="o">:</span> <span class="k">public</span> <span class="n">i2c_chip_interface</span>
<span class="p">{</span>
<span class="c1">// Implement methods...</span>
<span class="p">};</span>
<span class="k">struct</span> <span class="n">st_i2c</span><span class="o">:</span> <span class="k">public</span> <span class="n">i2c_chip_interface</span>
<span class="p">{</span>
<span class="c1">// Implement methods...</span>
<span class="p">};</span>
<span class="c1">// Use them</span>
<span class="kt">void</span> <span class="n">rtc_read</span><span class="p">(</span><span class="n">i2c_chip_interface</span><span class="o">&</span> <span class="n">iface</span><span class="p">);</span>
</code></pre></div></div>
<p>This is the classical way of decoupling interface and implementation, and
I started using this approach when I wrote my first basic drivers. It
seemed right at the time. I don’t have much experience writing generic code and this
closely mimics the more familiar C driver style.</p>
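<p>As a minimal sketch of how the abstract class is used, the snippet below implements it with a fake bus that only records the bytes it is asked to send. <code class="highlighter-rouge">recording_i2c</code> and <code class="highlighter-rouge">rtc_select_register</code> are names made up for this illustration, and a pointer plus length is used instead of <code class="highlighter-rouge">std::span</code> so that the example builds as C++17.</p>

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

struct i2c_chip_interface
{
    virtual ~i2c_chip_interface() = default;
    virtual void init() = 0;
    virtual void transmit(const std::uint8_t *data, std::size_t size) = 0;
};

// Hypothetical stand-in for a vendor implementation: it only
// remembers what was sent, so it can run on a desktop.
struct recording_i2c : public i2c_chip_interface
{
    std::array<std::uint8_t, 8> sent{};
    std::size_t count = 0;
    bool initialized = false;

    void init() override { initialized = true; }

    void transmit(const std::uint8_t *data, std::size_t size) override
    {
        // Bytes that do not fit in the record buffer are dropped.
        for (std::size_t i = 0; i < size && count < sent.size(); ++i)
            sent[count++] = data[i];
    }
};

// The RTC side only sees the abstract interface, never the vendor type.
// (Hypothetical helper, not part of any real RTC driver.)
void rtc_select_register(i2c_chip_interface &iface, std::uint8_t reg)
{
    iface.transmit(&reg, 1);
}
```

<p>Swapping microcontroller brand then means writing a new class that derives from <code class="highlighter-rouge">i2c_chip_interface</code>, while <code class="highlighter-rouge">rtc_select_register</code> stays untouched.</p>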
<h3 id="not-all-roses">Not all roses</h3>
<p>There are a couple of issues with these approaches though. Performance is
likely to take a hit since the compiler can’t inline an indirect call as
reliably as a direct one. This might not make much of a difference on
a 2GHz desktop, but it can make a big difference on an embedded target
running at 4MHz where you need to implement a manually bitbanged SPI bus.
While worrying about this is a bit of a premature optimization, in a hot code path these
things may actually matter.</p>
<p>Another problem is that of ownership. In a normal program you would
dynamically allocate the implementation and then transfer
ownership using smart pointers to the abstract class.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">auto</span> <span class="n">clk_pin</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">make_unique</span><span class="o"><</span><span class="n">atmel_pin</span><span class="o">></span><span class="p">(</span><span class="mi">1</span><span class="p">)};</span>
<span class="c1">// Create miso and mosi pins as well</span>
<span class="c1">// Transfer ownership of the pins to the software SPI</span>
<span class="n">sw_spi</span> <span class="n">spi</span><span class="p">{</span><span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">clk_pin</span><span class="p">),</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">miso_pin</span><span class="p">),</span> <span class="n">std</span><span class="o">::</span><span class="n">move</span><span class="p">(</span><span class="n">mosi_pin</span><span class="p">)};</span>
</code></pre></div></div>
<p>Embedded targets may not have a heap though,
so you cannot assume that dynamic memory exists at all!
The actual ownership of an implementation must live
outside of the object that implicitly owns the resource.</p>
<p>Consider a manually bitbanged SPI bus. The bus itself requires 3 GPIO pins:
a clock, master out slave in (MOSI) and master in slave out (MISO).
A basic implementation may look something like</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">sw_spi</span>
<span class="p">{</span>
<span class="c1">// gpio_pin is an abstract class that provides</span>
<span class="c1">// pin functionality, like setting the output value,</span>
<span class="c1">// reading the current value of the pin etc.</span>
<span class="n">gpio_pin</span> <span class="o">&</span><span class="n">clk_</span><span class="p">;</span>
<span class="n">gpio_pin</span> <span class="o">&</span><span class="n">mosi_</span><span class="p">;</span>
<span class="n">gpio_pin</span> <span class="o">&</span><span class="n">miso_</span><span class="p">;</span>
<span class="kt">void</span> <span class="nf">write</span><span class="p">(</span><span class="n">span</span><span class="o"><</span><span class="kt">uint8_t</span><span class="o">></span> <span class="n">data</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// write data using the pins</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
<p>Since <code class="highlighter-rouge">gpio_pin</code> is abstract we cannot store the pins as values,
we must store either pointers or references. But the pins are logically
owned by the SPI bus. Any usage outside the bus is most likely an error.
Any destruction of the pins while the SPI implementation lives is definitely
an error. In some regards the C implementation can actually work around this.
I won’t say it’s better, I won’t say it is recommended, I will say that the
classic C++ way has limitations.</p>
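<p>A small sketch may make the ownership problem concrete. Without a heap the pins have to be created somewhere else, for example as statics or as locals in <code class="highlighter-rouge">main</code>, and the bus can only refer to them. <code class="highlighter-rouge">fake_pin</code> and <code class="highlighter-rouge">idle()</code> are invented for this example, and <code class="highlighter-rouge">gpio_pin</code> is reduced to two member functions.</p>

```cpp
struct gpio_pin
{
    virtual ~gpio_pin() = default;
    virtual void set_value(bool high) = 0;
    virtual bool value() const = 0;
};

// Hypothetical pin that only stores its level in memory.
struct fake_pin : public gpio_pin
{
    bool level = false;
    void set_value(bool high) override { level = high; }
    bool value() const override { return level; }
};

struct sw_spi
{
    // The bus only refers to the pins, it does not own them.
    gpio_pin &clk_;
    gpio_pin &mosi_;
    gpio_pin &miso_;

    // Put the bus in its idle state: clock low, data low.
    void idle()
    {
        clk_.set_value(false);
        mosi_.set_value(false);
    }
};
```

<p>With <code class="highlighter-rouge">fake_pin clk, mosi, miso;</code> followed by <code class="highlighter-rouge">sw_spi spi{clk, mosi, miso};</code>, nothing stops other code from calling <code class="highlighter-rouge">clk.set_value(true)</code> behind the back of the bus, and nothing stops <code class="highlighter-rouge">clk</code> from going out of scope while <code class="highlighter-rouge">spi</code> is still in use.</p>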
<h2 id="templates-to-the-rescue">Templates to the rescue?</h2>
<p>Templates are a way of doing compile-time polymorphism. With templates
we are no longer restricted to storing pointers or references, so the
above ownership issues can be fixed.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="k">class</span> <span class="nc">ClkPin</span><span class="p">,</span> <span class="k">class</span> <span class="nc">MosiPin</span><span class="p">,</span> <span class="k">class</span> <span class="nc">MisoPin</span><span class="o">></span>
<span class="k">class</span> <span class="nc">sw_spi</span>
<span class="p">{</span>
<span class="n">ClkPin</span> <span class="n">clk_</span><span class="p">;</span>
<span class="n">MosiPin</span> <span class="n">mosi_</span><span class="p">;</span>
<span class="n">MisoPin</span> <span class="n">miso_</span><span class="p">;</span>
<span class="c1">// Rest of implementation</span>
<span class="p">};</span>
</code></pre></div></div>
<p>We no longer store references to data implicitly owned by
the SPI implementation. Everything is actually owned by
the class. Since the compiler knows which actual pin
implementation is used for the various pins, it is much
more likely to inline code than you would normally expect,
so performance may actually be better as well.</p>
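<p>To illustrate, here is a rough sketch of a templated software SPI that owns its pins by value and shifts one byte at a time. The pin types <code class="highlighter-rouge">memory_pin</code> and <code class="highlighter-rouge">always_high_pin</code>, the <code class="highlighter-rouge">transfer</code> function and the MSB-first, sample-while-clock-high behaviour are all assumptions made for the example, not a description of a real driver.</p>

```cpp
#include <cstdint>

// Hypothetical pin types. They share no base class, they only
// happen to provide the same member functions.
struct memory_pin
{
    bool level = false;
    void set_value(bool high) { level = high; }
    bool value() const { return level; }
};

struct always_high_pin
{
    void set_value(bool) {}
    bool value() const { return true; }
};

template<class ClkPin, class MosiPin, class MisoPin>
struct sw_spi
{
    // The pins are stored by value, so the bus really owns them.
    ClkPin clk_;
    MosiPin mosi_;
    MisoPin miso_;

    // Shift one byte out MSB first and sample MISO while the clock is high.
    std::uint8_t transfer(std::uint8_t out)
    {
        std::uint8_t in = 0;
        for (int bit = 7; bit >= 0; --bit)
        {
            mosi_.set_value(((out >> bit) & 1u) != 0);
            clk_.set_value(true);
            in = static_cast<std::uint8_t>((in << 1) | (miso_.value() ? 1u : 0u));
            clk_.set_value(false);
        }
        return in;
    }
};
```

<p>An instance such as <code class="highlighter-rouge">sw_spi&lt;memory_pin, memory_pin, always_high_pin&gt;</code> needs no heap and no virtual calls, and every pin call is a direct call the compiler can see through.</p>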
<p>One of the biggest headaches I have had with templates is that
it has been very difficult to specify an interface for a type.
All the pins in the example above <em>should</em> provide the same <em>functionality</em>
even if the types aren’t exactly the same, but to me there haven’t been
any straightforward ways to express that.</p>
<h2 id="concepts">Concepts</h2>
<p>Concepts is an upcoming feature in C++20. It allows us to express
constraints on template arguments. I think of them as a functionality
specification for templates, and with this we can easily describe what
functionality is expected of a template argument.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">namespace</span> <span class="n">gpio</span>
<span class="p">{</span>
<span class="c1">// Note that the syntax may or may not be 100% according</span>
<span class="c1">// to what will be in the standard. This is what ARM GCC offered</span>
<span class="c1">// at the time of writing this post.</span>
<span class="k">template</span><span class="o"><</span><span class="k">class</span> <span class="nc">T</span><span class="o">></span>
<span class="n">concept</span> <span class="kt">bool</span> <span class="n">GpioPin</span> <span class="o">=</span> <span class="n">requires</span> <span class="p">(</span><span class="n">T</span> <span class="n">t</span><span class="p">)</span> <span class="p">{</span>
<span class="p">{</span> <span class="n">t</span><span class="p">.</span><span class="n">configure_direction</span><span class="p">(</span><span class="n">pin</span><span class="o">::</span><span class="n">direction</span><span class="p">{})</span> <span class="p">};</span>
<span class="p">{</span> <span class="n">t</span><span class="p">.</span><span class="n">configure_pull</span><span class="p">(</span><span class="n">pin</span><span class="o">::</span><span class="n">pull_config</span><span class="p">{})</span> <span class="p">};</span>
<span class="p">{</span> <span class="n">t</span><span class="p">.</span><span class="n">set_value</span><span class="p">(</span><span class="kt">bool</span><span class="p">{})</span> <span class="p">};</span>
<span class="p">{</span> <span class="n">t</span><span class="p">.</span><span class="n">value</span><span class="p">()</span> <span class="p">}</span> <span class="o">-></span> <span class="kt">bool</span><span class="p">;</span>
<span class="p">{</span> <span class="n">t</span><span class="p">.</span><span class="n">toggle</span><span class="p">()</span> <span class="p">};</span>
<span class="p">};</span>
<span class="p">}</span>
<span class="k">template</span><span class="o"><</span><span class="n">gpio</span><span class="o">::</span><span class="n">GpioPin</span> <span class="n">ClkPin</span><span class="p">,</span>
<span class="n">gpio</span><span class="o">::</span><span class="n">GpioPin</span> <span class="n">MosiPin</span><span class="p">,</span>
<span class="n">gpio</span><span class="o">::</span><span class="n">GpioPin</span> <span class="n">MisoPin</span><span class="o">></span>
<span class="k">class</span> <span class="nc">sw_spi</span>
<span class="p">{</span>
<span class="n">ClkPin</span> <span class="n">clk_</span><span class="p">;</span>
<span class="n">MosiPin</span> <span class="n">mosi_</span><span class="p">;</span>
<span class="n">MisoPin</span> <span class="n">miso_</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>
<p>Now any class that satisfies the GpioPin concept can
be used with <code class="highlighter-rouge">sw_spi</code>, and if we have a concept for
a SPI bus, <code class="highlighter-rouge">sw_spi</code> can be used everywhere a SPI bus
is expected.</p>
<p>I haven’t used this way of thinking a lot yet, but I really like what
I have used it for so far. One of the benefits I think is that
you frame your mindset to think of drivers as composable and reusable
types, instead of implementation details on top of some interface (if that makes sense).</p>
<p>I can see some bumps along the road ahead, but nothing that can’t be overcome.
Will concepts actually become templates for the masses? I am excited to see
how this feature will work in the embedded space, and it is my hope that I will
one day be able to write code like</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Make one storage region out of 2 flash devices</span>
<span class="k">auto</span> <span class="n">raw_storage</span> <span class="o">=</span> <span class="n">storage</span><span class="o">::</span><span class="n">join_storage</span><span class="p">(</span><span class="n">flash_device1</span><span class="p">,</span> <span class="n">flash_device2</span><span class="p">);</span>
<span class="c1">// Reserve the first 1024 bytes for meta data</span>
<span class="k">auto</span> <span class="p">[</span><span class="n">meta_region</span><span class="p">,</span> <span class="n">fs_region</span><span class="p">]</span> <span class="o">=</span> <span class="n">storage</span><span class="o">::</span><span class="n">split</span><span class="p">(</span><span class="n">raw_storage</span><span class="p">,</span> <span class="mi">1024</span><span class="p">);</span>
<span class="c1">// The rest of available space is a FAT file system</span>
<span class="k">auto</span> <span class="n">fat32</span> <span class="o">=</span> <span class="n">fs</span><span class="o">::</span><span class="n">make_fat32</span><span class="p">(</span><span class="n">fs_region</span><span class="p">);</span>
</code></pre></div></div>
<p>or compose drivers in other creative ways.</p>
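<p>As a very rough idea of what the joining part could look like, here is a sketch built on plain templates. Everything in it is hypothetical: <code class="highlighter-rouge">ram_storage</code> stands in for a flash device, the <code class="highlighter-rouge">size</code>/<code class="highlighter-rouge">read</code>/<code class="highlighter-rouge">write</code> interface is invented, there is no bounds checking, and the joined region simply refers to its children to keep the example short.</p>

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory device standing in for a flash chip.
template<std::size_t N>
struct ram_storage
{
    std::array<std::uint8_t, N> bytes{};
    std::size_t size() const { return N; }
    std::uint8_t read(std::size_t offset) const { return bytes[offset]; }
    void write(std::size_t offset, std::uint8_t value) { bytes[offset] = value; }
};

// Presents two devices as one contiguous region. Offsets below
// first_.size() go to the first device, the rest to the second.
template<class First, class Second>
struct joined_storage
{
    First &first_;
    Second &second_;

    std::size_t size() const { return first_.size() + second_.size(); }

    std::uint8_t read(std::size_t offset) const
    {
        return offset < first_.size() ? first_.read(offset)
                                      : second_.read(offset - first_.size());
    }

    void write(std::size_t offset, std::uint8_t value)
    {
        if (offset < first_.size())
            first_.write(offset, value);
        else
            second_.write(offset - first_.size(), value);
    }
};

template<class First, class Second>
joined_storage<First, Second> join_storage(First &first, Second &second)
{
    return {first, second};
}
```

<p>Because <code class="highlighter-rouge">joined_storage</code> offers the same three member functions as its children, a joined region can itself be joined again or handed to anything else that expects a storage device.</p>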
<p>I have decided to ditch the classical runtime polymorphism
during my embedded C++ development. I will
actually try to use concepts a lot more instead. I will for
now stay in wonderland and try to discover how far down the
rabbit hole I can go.</p>
<p>If there are any errors, or if I have missed something I would
like to know! I am not in any way, shape or form an expert.
Not when it comes to programming or programming concepts.</p>
<h1 id="a-basic-scheduler">A basic scheduler</h1>
<p>2019-03-03</p>
<p>In <em><a href="/blog/2019/02/23/multitasking-1.html">Multitasking 1 of n</a></em> I showed
how I started implementing some very basic multitasking functionality, using cooperative
multitasking. In that post there was no real logic when switching between tasks; I
just created two tasks and switched between them in the most basic manner possible. If I wanted to add a third
task I would have to change the code in multiple places and make sure things were kept in sync. There was no easy
way to support removing tasks.
I therefore chose to start implementing my first actual scheduler: a round robin scheduler.</p>
<h2 id="round-robin-scheduling">Round robin scheduling</h2>
<p><a href="https://en.wikipedia.org/wiki/Round-robin_scheduling">Round robin</a> scheduling is a fair algorithm for
task scheduling; when a task yields, the scheduler activates the next task, until all tasks have
been activated, at which point the cycle starts over from the beginning. This is a pretty straightforward
algorithm: it is easy to understand how it should work and there aren’t really any tricky corner cases to
consider.</p>
<p><img src="/blog/images/blog/round_robin.png" alt="Round robin image" /></p>
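<p>Stripped of all task handling, the core of the algorithm is just a cursor that wraps around. The sketch below assumes a fixed number of tasks identified by index; <code class="highlighter-rouge">round_robin_cursor</code> is a made-up name and not part of the actual scheduler.</p>

```cpp
#include <cstddef>

// Hypothetical scheduler core: tasks are identified by an index
// in the range [0, TaskCount).
template<std::size_t TaskCount>
struct round_robin_cursor
{
    static_assert(TaskCount > 0, "at least one task is required");

    std::size_t current = 0;

    // Called when the running task yields. Returns the task to run
    // next and wraps back to task 0 after the last one.
    std::size_t next()
    {
        current = (current + 1) % TaskCount;
        return current;
    }
};
```

<p>With three tasks and task 0 running first, successive calls to <code class="highlighter-rouge">next()</code> give 1, 2, 0, 1 and so on, so every task gets its turn before any task runs twice.</p>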
<p>An RTOS will most likely have functionality to prioritize tasks, letting tasks with higher
priority execute before tasks with lower priority, and a lot of logic to handle things like
priority inheritance. These things will be implemented eventually,
but right now I am in a stage where the code changes a lot and even the basic code structure is
constantly evolving, so I will let this kind of functionality evolve slowly.</p>
<h2 id="building-blocks">Building blocks</h2>
<p>Looking at the image above we can see that we first need to encapsulate the data of a particular task.
We also need to have some way of building a circular list of this task data. Since this is aimed at embedded targets
I also added a requirement that no dynamic memory shall be used, which means that no standard container is viable.</p>
<p>Now as seen before each task is described pretty much only by its stack, so a structure holding this data for each task
is logical.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">task_data</span>
<span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="o">*</span><span class="n">stack_bottom</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="o">*</span><span class="n">stack_pointer</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>
<p>I then needed some way of storing task data in a container, without having to rely on dynamic memory.
This led me to make a very basic implementation that looked something like</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">round_robin</span>
<span class="p">{</span>
<span class="k">struct</span> <span class="n">task_data</span>
<span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="o">*</span><span class="n">stack_bottom</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="o">*</span><span class="n">stack_pointer</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="n">task_data</span> <span class="o">*</span><span class="n">next</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="n">task_data</span> <span class="o">*</span><span class="n">prev</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="p">};</span>
<span class="n">task_data</span> <span class="o">*</span><span class="n">head_</span><span class="p">;</span>
<span class="n">task_data</span> <span class="o">*</span><span class="n">tail_</span><span class="p">;</span>
<span class="n">task_data</span> <span class="nf">create_task</span><span class="p">()</span> <span class="p">{</span>
<span class="k">return</span> <span class="p">{};</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">add_task</span><span class="p">(</span><span class="n">task_data</span> <span class="o">&</span><span class="n">task</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Add it to my linked list</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
<p>The basic design goal was to allow the user to own the memory, but to still be able
to build a linked list of user-supplied nodes. Basically I had “invented” a poorly made
intrusive list.</p>
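<p>For readers who have not seen the idea before, the sketch below shows what “intrusive” means here: the link pointer lives inside the node, and the list only wires up nodes that the user owns. It is a simplified, singly linked variant with invented names (<code class="highlighter-rouge">task_node</code>, <code class="highlighter-rouge">task_ring</code>), not the code that was actually in the scheduler.</p>

```cpp
#include <cstdint>

struct task_node
{
    std::uint8_t *stack_bottom = nullptr;
    std::uint8_t *stack_pointer = nullptr;
    task_node *next = nullptr; // link stored inside the node itself
};

// Hypothetical minimal circular list. It never allocates: the caller
// owns every node and must keep it alive while it is linked.
struct task_ring
{
    task_node *tail_ = nullptr;

    void add_task(task_node &task)
    {
        if (tail_ == nullptr)
        {
            task.next = &task; // a single node points at itself
        }
        else
        {
            task.next = tail_->next; // new node points at the head
            tail_->next = &task;     // old tail points at the new node
        }
        tail_ = &task;
    }

    // First task that was added, or nullptr if the ring is empty.
    task_node *head() const { return tail_ ? tail_->next : nullptr; }
};
```

<p>Even this small version leaves out removal and protection against adding the same node twice, which gives a hint of the corner cases a real implementation has to deal with.</p>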
<p>Now this worked for a short while, but soon I became frustrated because I had mixed
list-keeping logic with the scheduler-logic. I started encapsulating the list-keeping code
into its own generic class but soon got fed up and thought that there must be a solution available!</p>
<h3 id="boosting-eppos">Boosting EppOS</h3>
<p>After some googling I found the intrusive container library in <a href="https://www.boost.org/">Boost</a>
and decided to see if its intrusive list implementation
could fit my requirements, and it did! The <code class="highlighter-rouge">boost::intrusive::list</code> implementation can be configured
to not use exceptions, but as always, reading the documentation is key. It requires some modifications of the
stored datatype to make it usable, which meant that the <code class="highlighter-rouge">round_robin_data</code> structure had to be adapted</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">namespace</span> <span class="n">bint</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">intrusive</span><span class="p">;</span>
<span class="k">using</span> <span class="n">intrusive_link_mode</span> <span class="o">=</span> <span class="n">bint</span><span class="o">::</span><span class="n">link_mode</span><span class="o"><</span><span class="n">bint</span><span class="o">::</span><span class="n">link_mode_type</span><span class="o">::</span><span class="n">normal_link</span><span class="o">></span><span class="p">;</span>
<span class="k">struct</span> <span class="n">task</span><span class="o">:</span> <span class="k">public</span> <span class="n">bint</span><span class="o">::</span><span class="n">list_base_hook</span><span class="o"><</span><span class="n">intrusive_link_mode</span><span class="o">></span>
<span class="p">{</span>
<span class="c1">// Task data goes here</span>
<span class="p">};</span>
</code></pre></div></div>
<p>This is all that is needed to make a datatype usable with the intrusive list. Boost
also supports adding a member variable and using that as a hook instead, but I felt that this
was cleaner and easier to use with the actual list implementation. The member hook requires
a pointer to member as a template parameter.</p>
<p>A word about <code class="highlighter-rouge">link_mode_type</code>. Three different link policies are supported: <strong>normal</strong>, <strong>safe</strong> and <strong>auto unlink</strong>.
In short, a normal link won’t set the hooks to a default state when elements are erased. Containers also won’t check the
hooks to make sure they are default initialized. Safe will set the hooks to a default state when elements are erased, and
containers will make sure they are set to default when inserting new elements. Auto unlink is the same as <em>safe</em> but also
tells containers that elements can be silently erased without going through container-provided functions.</p>
<p>At first glance auto unlink sounds good, but it comes with trade-offs. For instance, it only works with containers that have <strong>non</strong>-constant time
<code class="highlighter-rouge">size()</code>. As always, reading the documentation and evaluating different aspects is key. I chose <code class="highlighter-rouge">normal_link</code> above since
that is the most performant option, but I may very well opt for using the safe option later on.</p>
<p>With this small adaptation I can use <code class="highlighter-rouge">boost::intrusive::list</code> and have a type-safe, well-tested,
performant and all-in-all well-designed intrusive list that plays nicely with standard algorithms.
I haven’t done any code-size comparisons. I started implementing a very basic intrusive list on my own
but I soon got fed up with corner cases. This says something about the value of being able to use
an already existing algorithm, and even if I could implement a list-type of my own that was smaller in
code size, I am much more confident of the correctness of the boost implementation than I would be of my
own. EppOS will not suffer from NIH syndrome!</p>
<p>After this my <code class="highlighter-rouge">round_robin</code> structure and some helpers look something like</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">task_data</span>
<span class="p">{</span>
<span class="k">typedef</span> <span class="kt">void</span><span class="p">(</span><span class="o">*</span><span class="n">main_fn</span><span class="p">)();</span>
<span class="k">typedef</span> <span class="kt">void</span><span class="p">(</span><span class="o">*</span><span class="n">end_fn</span><span class="p">)();</span>
<span class="n">main_fn</span> <span class="n">main</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="n">end_fn</span> <span class="n">end</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span><span class="o">*</span> <span class="n">stack_pointer</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span><span class="o">*</span> <span class="n">stack_bottom</span> <span class="o">=</span> <span class="nb">nullptr</span><span class="p">;</span>
<span class="kt">bool</span> <span class="n">is_started</span> <span class="o">=</span> <span class="nb">false</span><span class="p">;</span>
<span class="p">};</span>
<span class="k">namespace</span> <span class="n">core</span><span class="o">::</span><span class="n">task_scheduler</span>
<span class="p">{</span>
<span class="c1">// "reset" task data. This can be done in C++ but I haven't</span>
<span class="c1">// bothered to do that just yet.</span>
<span class="kt">void</span> <span class="n">reset</span><span class="p">(</span><span class="n">task_data</span> <span class="o">&</span><span class="n">tcb</span><span class="p">)</span> <span class="p">{</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"push {r4}"</span><span class="p">);</span> <span class="c1">// Push a utility register</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov r4, lr</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span> <span class="c1">// Store old return address</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov lr, %0</span><span class="se">\n</span><span class="s">"</span> <span class="o">::</span> <span class="s">"r"</span><span class="p">(</span><span class="n">tcb</span><span class="p">.</span><span class="n">main</span><span class="p">));</span> <span class="c1">// Temporarily make tcb.main lr</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"push {r3}"</span><span class="p">);</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov r3, %0"</span> <span class="o">::</span> <span class="s">"r"</span><span class="p">(</span><span class="n">tcb</span><span class="p">.</span><span class="n">stack_bottom</span><span class="p">));</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"stmdb r3!, {r4-r11, lr}</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span> <span class="c1">// Push using r3 as the stack pointer</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov %0, r3</span><span class="se">\n</span><span class="s">"</span> <span class="o">:</span> <span class="s">"=r"</span><span class="p">(</span><span class="n">tcb</span><span class="p">.</span><span class="n">stack_pointer</span><span class="p">));</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"pop {r3}"</span><span class="p">);</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov lr, r4</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"pop {r4}"</span><span class="p">);</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"BX LR"</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="k">namespace</span> <span class="n">bint</span> <span class="o">=</span> <span class="n">boost</span><span class="o">::</span><span class="n">intrusive</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">round_robin</span>
<span class="p">{</span>
<span class="k">using</span> <span class="n">intrusive_link_mode</span> <span class="o">=</span> <span class="n">bint</span><span class="o">::</span><span class="n">link_mode</span><span class="o"><</span><span class="n">bint</span><span class="o">::</span><span class="n">link_mode_type</span><span class="o">::</span><span class="n">normal_link</span><span class="o">></span><span class="p">;</span>
<span class="k">struct</span> <span class="n">task</span><span class="o">:</span> <span class="k">public</span> <span class="n">bint</span><span class="o">::</span><span class="n">list_base_hook</span><span class="o"><</span><span class="n">intrusive_link_mode</span><span class="o">></span>
<span class="p">{</span>
<span class="n">task_data</span> <span class="n">tcb</span><span class="p">;</span>
<span class="p">};</span>
<span class="n">bint</span><span class="o">::</span><span class="n">list</span><span class="o"><</span><span class="n">task</span><span class="o">></span> <span class="n">task_list</span><span class="p">;</span>
<span class="k">typename</span> <span class="n">bint</span><span class="o">::</span><span class="n">list</span><span class="o"><</span><span class="n">task</span><span class="o">>::</span><span class="n">iterator</span> <span class="n">current_task</span><span class="p">;</span>
<span class="n">round_robin</span><span class="p">()</span><span class="o">:</span> <span class="n">current_task</span><span class="p">(</span><span class="n">task_list</span><span class="p">.</span><span class="n">end</span><span class="p">())</span> <span class="p">{}</span>
<span class="c1">// This function is called to create new tasks. A new task is not</span>
<span class="c1">// automatically added to the list of tasks.</span>
<span class="n">task</span> <span class="n">create_task</span><span class="p">(</span><span class="n">task_data</span><span class="o">::</span><span class="n">main_fn</span> <span class="n">start_func</span><span class="p">,</span>
<span class="n">task_data</span><span class="o">::</span><span class="n">end_fn</span> <span class="n">end_func</span><span class="p">,</span>
<span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="o">*</span><span class="n">stack_bottom</span><span class="p">)</span> <span class="p">{</span>
<span class="n">task</span> <span class="n">retval</span><span class="p">;</span>
<span class="n">retval</span><span class="p">.</span><span class="n">tcb</span><span class="p">.</span><span class="n">main</span> <span class="o">=</span> <span class="n">start_func</span><span class="p">;</span>
<span class="n">retval</span><span class="p">.</span><span class="n">tcb</span><span class="p">.</span><span class="n">end</span> <span class="o">=</span> <span class="n">end_func</span><span class="p">;</span>
<span class="n">retval</span><span class="p">.</span><span class="n">tcb</span><span class="p">.</span><span class="n">stack_bottom</span> <span class="o">=</span> <span class="n">stack_bottom</span><span class="p">;</span>
<span class="n">core</span><span class="o">::</span><span class="n">task_scheduler</span><span class="o">::</span><span class="n">reset</span><span class="p">(</span><span class="n">retval</span><span class="p">.</span><span class="n">tcb</span><span class="p">);</span>
<span class="k">return</span> <span class="n">retval</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="n">add_task</span><span class="p">(</span><span class="n">task</span> <span class="o">&</span><span class="n">task</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Add it to my linked list</span>
<span class="n">task_list</span><span class="p">.</span><span class="n">push_back</span><span class="p">(</span><span class="n">task</span><span class="p">);</span>
<span class="k">if</span><span class="p">(</span><span class="n">current_task</span> <span class="o">==</span> <span class="n">task_list</span><span class="p">.</span><span class="n">end</span><span class="p">())</span> <span class="p">{</span>
<span class="n">current_task</span> <span class="o">=</span> <span class="n">task_list</span><span class="p">.</span><span class="n">begin</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1">// Requires that we have tasks, but in a kernel</span>
<span class="c1">// there will always be at least 1 task present; the kernel</span>
<span class="c1">// idle task.</span>
<span class="n">task</span><span class="o">&</span> <span class="n">next</span><span class="p">()</span> <span class="p">{</span>
<span class="n">current_task</span><span class="o">++</span><span class="p">;</span>
<span class="c1">// wraparound logic.</span>
<span class="k">if</span><span class="p">(</span><span class="n">current_task</span> <span class="o">==</span> <span class="n">task_list</span><span class="p">.</span><span class="n">end</span><span class="p">())</span> <span class="p">{</span>
<span class="n">current_task</span> <span class="o">=</span> <span class="n">task_list</span><span class="p">.</span><span class="n">begin</span><span class="p">();</span>
<span class="p">}</span>
<span class="k">return</span> <span class="o">*</span><span class="n">current_task</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">task</span><span class="o">&</span> <span class="n">current</span><span class="p">()</span> <span class="p">{</span>
<span class="k">return</span> <span class="o">*</span><span class="n">current_task</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
<p>From here on everything just fell into place. The old <code class="highlighter-rouge">yield</code> function only required minor
modifications.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// This is the new yield</span>
<span class="kt">void</span> <span class="nf">__attribute__</span><span class="p">((</span><span class="n">noinline</span><span class="p">,</span> <span class="kr">naked</span><span class="p">))</span> <span class="n">yield</span><span class="p">()</span> <span class="p">{</span>
<span class="c1">// Store current register contexts to the current stack</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"push {R4-R11, LR}"</span><span class="p">);</span>
<span class="c1">// Store the current context stack pointer</span>
<span class="k">auto</span> <span class="o">&</span><span class="n">current</span> <span class="o">=</span> <span class="n">rr</span><span class="p">.</span><span class="n">current</span><span class="p">();</span> <span class="c1">// <- Get the current task</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov %0, sp"</span> <span class="o">:</span><span class="s">"=r"</span><span class="p">(</span><span class="n">current</span><span class="p">.</span><span class="n">tcb</span><span class="p">.</span><span class="n">stack_pointer</span><span class="p">));</span>
<span class="c1">// Calculate which stack to move to</span>
<span class="k">auto</span> <span class="o">&</span><span class="n">next</span> <span class="o">=</span> <span class="n">rr</span><span class="p">.</span><span class="n">next</span><span class="p">();</span> <span class="c1">// <- Get the next task from round_robin scheduler</span>
<span class="c1">// Load the new stack pointer</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov sp, %0"</span> <span class="o">::</span> <span class="s">"r"</span><span class="p">(</span><span class="n">next</span><span class="p">.</span><span class="n">tcb</span><span class="p">.</span><span class="n">stack_pointer</span><span class="p">));</span>
<span class="c1">// Pop the registers that were stored</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"pop {R4-R11, LR}"</span><span class="p">);</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"BX LR"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Creating and adding tasks is simply a matter of</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">stack1</span><span class="p">[</span><span class="mi">512</span><span class="p">];</span>
<span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">stack2</span><span class="p">[</span><span class="mi">512</span><span class="p">];</span>
<span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">stack3</span><span class="p">[</span><span class="mi">512</span><span class="p">];</span>
<span class="k">auto</span> <span class="n">task1_handle</span> <span class="o">=</span> <span class="n">rr</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">task1</span><span class="p">,</span> <span class="nb">nullptr</span><span class="p">,</span> <span class="n">stack1</span><span class="o">+</span><span class="mi">512</span><span class="p">);</span>
<span class="k">auto</span> <span class="n">task2_handle</span> <span class="o">=</span> <span class="n">rr</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">task2</span><span class="p">,</span> <span class="nb">nullptr</span><span class="p">,</span> <span class="n">stack2</span><span class="o">+</span><span class="mi">512</span><span class="p">);</span>
<span class="k">auto</span> <span class="n">task3_handle</span> <span class="o">=</span> <span class="n">rr</span><span class="p">.</span><span class="n">create_task</span><span class="p">(</span><span class="n">task3</span><span class="p">,</span> <span class="nb">nullptr</span><span class="p">,</span> <span class="n">stack3</span><span class="o">+</span><span class="mi">512</span><span class="p">);</span>
<span class="n">rr</span><span class="p">.</span><span class="n">add_task</span><span class="p">(</span><span class="n">task1_handle</span><span class="p">);</span>
<span class="n">rr</span><span class="p">.</span><span class="n">add_task</span><span class="p">(</span><span class="n">task2_handle</span><span class="p">);</span>
<span class="n">rr</span><span class="p">.</span><span class="n">add_task</span><span class="p">(</span><span class="n">task3_handle</span><span class="p">);</span>
</code></pre></div></div>
<p>And voilà, we have started to implement a basic round-robin scheduler with support for an arbitrary
number of tasks. Adding new tasks is very easy, removing tasks can easily be supported, and having
multiple lists and moving tasks between them is a breeze. All thanks to C++, templates and Boost.</p>
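<p>As a rough sketch of what removing a task could look like, here is a host-side model of the scheduler.
It is not the code from above: <code class="highlighter-rouge">std::list</code> stands in for <code class="highlighter-rouge">boost::intrusive::list</code> so that it
compiles without Boost, the <code class="highlighter-rouge">task</code> struct is reduced to an id, and <code class="highlighter-rouge">remove_task</code> is a hypothetical
function name. The only subtle part is keeping <code class="highlighter-rouge">current_task</code> valid when the running task itself is removed.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <algorithm>
#include <list>

// Simplified stand-in for round_robin::task; only an id, no TCB.
struct task { int id; };

// Host-side model of the scheduler (assumption: std::list replaces
// boost::intrusive::list purely to keep the sketch free of Boost).
struct round_robin_model {
    std::list<task*> task_list;
    std::list<task*>::iterator current_task = task_list.end();

    void add_task(task &t) {
        task_list.push_back(&t);
        if (current_task == task_list.end()) {
            current_task = task_list.begin();
        }
    }

    // Hypothetical removal. If the running task is removed, current_task
    // steps back one element (wrapping to the last one) so that the
    // following next() lands on the successor of the removed task.
    void remove_task(task &t) {
        auto it = std::find(task_list.begin(), task_list.end(), &t);
        if (it == task_list.end()) {
            return; // not scheduled, nothing to do
        }
        if (it == current_task) {
            if (task_list.size() == 1) {
                current_task = task_list.end();
            } else {
                if (current_task == task_list.begin()) {
                    current_task = task_list.end();
                }
                --current_task;
            }
        }
        task_list.erase(it);
    }

    // Same wraparound logic as in round_robin::next().
    task& next() {
        ++current_task;
        if (current_task == task_list.end()) {
            current_task = task_list.begin();
        }
        return **current_task;
    }
};
</code></pre></div></div>
<p>With the intrusive list I would expect the linear search to go away, since the list can hand out an iterator
for an element directly, but I have not tried that yet.</p>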
<h2 id="some-thoughts">Some thoughts</h2>
<p>Now I have really started to appreciate the value of using C++ over C. Doing something similar in C would not be nearly as
easy, and I seriously doubt it would be as type-safe as in C++. This implementation became very straightforward once I started using
<code class="highlighter-rouge">boost::intrusive::list</code>. I already have some ideas on how to implement, for instance, <code class="highlighter-rouge">join</code> or <code class="highlighter-rouge">sleep_for</code>, but that is
for another day.</p>In Multitasking 1 of n I showed how I started implementing some very basic multitasking functionality, using cooperative multitasking. In that post there was no real logic when switching between tasks; I just created two tasks and switched between them in the most basic manner possible. If I wanted to add a third task I would have to change the code in multiple places and make sure things were kept in sync. There was no easy way to support removing tasks. I therefore chose to start implementing my first actual scheduler; a round robin scheduler.Multitasking 1 of n2019-02-23T00:00:00+00:002019-02-23T00:00:00+00:00/blog/2019/02/23/multitasking-1<p>One of the main features an OS provides is the ability to perform multiple tasks simultaneously,
or at least simulate it so it seems that way.</p>
<p>Before writing this post, my quite small code base did not provide any facilities at all for multitasking,
and I had never written any multitasking code for embedded systems either. But I wanted to at least start
getting some multitasking going. This is also a good exercise for learning more about an MCU, since I have to learn
more about the nitty-gritty details like calling conventions, context switching, stacks, etc.</p>
<h2 id="choosing-a-style">Choosing a style</h2>
<p>To try and keep things relatively simple, and easier to reason about, I chose to start with
cooperative multitasking. This is a style of multitasking that relies on tasks to be good citizens and
yield control to the kernel every now and then. A task will never be interrupted by the kernel itself.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">task_1</span><span class="p">()</span> <span class="p">{</span>
<span class="k">while</span><span class="p">(</span><span class="nb">true</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Do some work</span>
<span class="n">yield</span><span class="p">();</span> <span class="c1">// Kernel may switch task here</span>
<span class="c1">// Do some more work</span>
<span class="n">yield</span><span class="p">();</span> <span class="c1">// Kernel may switch task here</span>
<span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
<p>This style is easier to reason about since a task has full control over when other tasks
may run, making data races between tasks pretty much impossible (if they are possible I sure would
like an example).</p>
<h2 id="small-steps">Small steps</h2>
<p>A first step to multitasking was to get the following code to run and work:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">stack</span><span class="p">[</span><span class="mi">2</span><span class="p">][</span><span class="mi">512</span><span class="p">];</span> <span class="c1">// Buffers to use as stacks</span>
<span class="kt">void</span> <span class="nf">yield</span><span class="p">()</span> <span class="p">{</span>
<span class="c1">// Do stuff</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">task1</span><span class="p">()</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">var</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="n">printf</span><span class="p">(</span><span class="s">"In task1 main, var=%d</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">var</span><span class="o">++</span><span class="p">);</span>
<span class="n">yield</span><span class="p">();</span>
<span class="n">std</span><span class="o">::</span><span class="n">printf</span><span class="p">(</span><span class="s">"In task1 after yield, var=%d</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">var</span><span class="p">);</span>
<span class="k">while</span><span class="p">(</span><span class="nb">true</span><span class="p">)</span> <span class="p">{</span>
<span class="n">yield</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">task2</span><span class="p">()</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">var</span> <span class="o">=</span> <span class="mi">234</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="n">printf</span><span class="p">(</span><span class="s">"In task2 main, var=%d</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">var</span><span class="o">++</span><span class="p">);</span>
<span class="n">yield</span><span class="p">();</span>
<span class="n">std</span><span class="o">::</span><span class="n">printf</span><span class="p">(</span><span class="s">"In task2 after yield, var=%d</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">var</span><span class="p">);</span>
<span class="k">while</span><span class="p">(</span><span class="nb">true</span><span class="p">)</span> <span class="p">{</span>
<span class="n">yield</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="c1">// Do something that kicks off the sequence</span>
<span class="p">}</span>
</code></pre></div></div>
<p>I would expect this to print the following output</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In task1 main, var=1
In task2 main, var=234
In task1 after yield, var=2
In task2 after yield, var=235
</code></pre></div></div>
<p>I know the optimizer can remove the variables, but I still felt that this was a good
start.</p>
<h2 id="the-yield-function">The yield function</h2>
<p>The yield function is where most of the magic happens. What we basically want to do is
to execute some set of instructions so that when we return from the yield function we don’t
return back to where we previously came from; we want to return back into the other
main function (<code class="highlighter-rouge">task1 -> yield -> task2 -> yield -> task1</code> etc.) and continue where
that function left off (or start the function in the case of the first yield into <code class="highlighter-rouge">task2</code>).</p>
<p>To do this we need to understand how returns work, and how return addresses are stored when
making function calls.</p>
<p>When calling a function the processor uses some kind of branching instruction, and reading
the ARM instruction set we see that there are a bunch of them.</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>B ; Branch immediate
BL ; Branch with link
BX ; Branch indirect (register)
BLX ; Branch indirect with link (register)
</code></pre></div></div>
<p>Reading about these we learn that <code class="highlighter-rouge">BL</code> and <code class="highlighter-rouge">BLX</code> write the address of the next instruction,
after the branch, to the link register (<code class="highlighter-rouge">LR</code> or <code class="highlighter-rouge">R14</code>). So apparently we should be able to
modify the <code class="highlighter-rouge">LR</code> content to be able to control where the <code class="highlighter-rouge">yield</code> function returns back to.
This gives us the first clue to the puzzle.</p>
<h2 id="contexts">Contexts</h2>
<p>Before we can fill in the rest of the puzzle we must know more about the registers.
Specifically we need to know what registers must be saved before jumping to a new function,
so that we can resume the old function correctly again. Storing the current state of a processor
and restoring some old state is called <code class="highlighter-rouge">context switching</code>. In its most basic form it's simply
the act of storing some values to some storage, and restoring new values from a different storage before resuming execution. Hopefully we are now executing as if the old state was never left.</p>
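<p>Stripped of all hardware details, the idea can be modelled on a host machine in a few lines. This is only an
illustration: the <code class="highlighter-rouge">context</code> struct and the <code class="highlighter-rouge">context_switch</code> function are hypothetical names, and which
registers actually belong in a context is an assumption at this point in the post.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <array>
#include <cstdint>

// Hypothetical model of the state that has to survive a switch.
struct context {
    std::array<std::uint32_t, 8> r4_r11{}; // general purpose registers
    std::uint32_t lr = 0;                  // where to continue executing
    std::uint32_t sp = 0;                  // which stack to use
};

// Store the live state into `outgoing`, then make `incoming` the live state.
// `outgoing` and `incoming` are expected to be different objects.
void context_switch(context &cpu, context &outgoing, const context &incoming) {
    outgoing = cpu;
    cpu = incoming;
}
</code></pre></div></div>
<p>Switching away from a context and later switching back leaves the live state exactly as it was, which is the
property the real implementation has to provide.</p>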
<p>To dig into how to accomplish this I had to read up on how context switches work on an ARM
Cortex M3. I quickly learned that the EABI calling convention says that the registers
<code class="highlighter-rouge">R4-R11</code> together with <code class="highlighter-rouge">LR</code> are <em>callee-saved</em>. That is, they must be saved by a function before they can be used
inside the function, and they must be restored to their old values before returning from the function.
The other registers are <em>caller-saved</em>: if the caller still needs their values after a call, it must save them itself before branching.</p>
<h2 id="the-stack">The stack</h2>
<p>Each context can be associated with a stack. A stack is a blob of memory that the processor can <code class="highlighter-rouge">push</code> and <code class="highlighter-rouge">pop</code> values
to/from. The MCU has yet another special register, the stack-pointer (<code class="highlighter-rouge">SP</code>), that keeps track of the stack. This register
is incremented or decremented automatically with each <code class="highlighter-rouge">push/pop</code> execution. Now that means we also need to keep track of
the stack-pointer each time we perform a context switch, and each task should have its own stack since that is part of its
execution state.</p>
<p>We can however utilize the stack to store the register values for us, and since pushing and popping is something very common
there are utility instructions that can help us with this.</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>push {R4-R11, LR} ; push values from registers R4 to R11 and LR on to the stack
pop {R4-R11, LR} ; pop values from the stack into registers R4 to R11 and LR
</code></pre></div></div>
<h3 id="push-and-pop">Push and pop</h3>
<p><code class="highlighter-rouge">push</code> and <code class="highlighter-rouge">pop</code> are actually shorthand for <code class="highlighter-rouge">stmdb sp!, <reglist></code> and <code class="highlighter-rouge">ldmia sp!, <reglist></code>, where
<code class="highlighter-rouge">stmdb</code> means <em>store multiple registers, decrement address before access</em> and <code class="highlighter-rouge">ldmia</code> means <em>load multiple registers, increment address after access</em>.
<code class="highlighter-rouge">sp!</code> means that the address should be read from <code class="highlighter-rouge">sp</code> and the newly calculated address is also written back to <code class="highlighter-rouge">sp</code> again. <code class="highlighter-rouge"><reglist></code>
can be for instance <code class="highlighter-rouge">{R4-R11, LR}</code> or something similar.
In C-style pseudo-code this can be described as</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">std</span><span class="o">::</span><span class="kt">uint32_t</span> <span class="o">*</span><span class="n">sp</span><span class="p">;</span>
<span class="c1">// stmdb</span>
<span class="n">sp</span><span class="o">--</span><span class="p">;</span>
<span class="o">*</span><span class="n">sp</span> <span class="o">=</span> <span class="n">register_value</span><span class="p">;</span>
<span class="c1">// ldmia</span>
<span class="n">register_value</span> <span class="o">=</span> <span class="o">*</span><span class="n">sp</span><span class="p">;</span>
<span class="n">sp</span><span class="o">++</span><span class="p">;</span>
</code></pre></div></div>
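<p>In reality the register holds a byte address rather than a pointer to a 32 bit value, so each stored or loaded register moves it by 4 bytes.</p>
<p>Since <code class="highlighter-rouge">stmdb</code> and <code class="highlighter-rouge">ldmia</code> take the base register as an operand, it is possible to use some other register than <code class="highlighter-rouge">sp</code> and still get the same functionality, which can come in handy
when initializing a stack for a new task.</p>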
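<p>To convince myself of the memory layout, the two instructions can be modelled on a host machine for the
register list <code class="highlighter-rouge">{R4-R11, LR}</code>. This is a sketch under assumptions, not target code: the function names simply
mirror the mnemonics, the base pointer is passed in and returned instead of living in a register, and a
<code class="highlighter-rouge">uint32_t</code> pointer is used so that one step equals one register.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include <cstddef>
#include <cstdint>

// Model of "stmdb rX!, {r4-r11, lr}". regs holds r4..r11 followed by lr.
// The lowest-numbered register ends up at the lowest address, and the
// returned pointer is the written-back base, pointing at that lowest word.
std::uint32_t *stmdb(std::uint32_t *base, const std::uint32_t (&regs)[9]) {
    for (std::size_t i = 9; i-- > 0;) {
        --base;          // decrement before access
        *base = regs[i];
    }
    return base;
}

// Model of "ldmia rX!, {r4-r11, lr}". Reads the words back in ascending
// address order and returns the written-back base.
std::uint32_t *ldmia(std::uint32_t *base, std::uint32_t (&regs)[9]) {
    for (std::size_t i = 0; i < 9; ++i) {
        regs[i] = *base;
        ++base;          // increment after access
    }
    return base;
}
</code></pre></div></div>
<p>Calling <code class="highlighter-rouge">stmdb</code> with a pointer just past the end of a fresh buffer gives back the value a new task's
stack pointer should start out with, without ever touching the real <code class="highlighter-rouge">sp</code>.</p>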
<h2 id="the-first-context-switch">The first context switch</h2>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include <cstdio>
#include <cstdint>
</span>
<span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">stack</span><span class="p">[</span><span class="mi">2</span><span class="p">][</span><span class="mi">512</span><span class="p">];</span>
<span class="c1">// Let the stack pointers point to the bottom of the stacks</span>
<span class="c1">// Remember that a push decrements the pointer before "dereferencing"</span>
<span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="o">*</span><span class="n">sp</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">=</span> <span class="p">{</span><span class="n">stack</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">+</span> <span class="mi">512</span><span class="p">,</span> <span class="n">stack</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">+</span> <span class="mi">512</span><span class="p">};</span>
<span class="kt">int</span> <span class="n">current_task</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="c1">// The noinline attribute tells the compiler that it</span>
<span class="c1">// shouldn't inline the function, and naked tells it to</span>
<span class="c1">// not generate any prologue or epilogue, such as pushing/popping</span>
<span class="c1">// registers it intends to use. Since the first and last thing we</span>
<span class="c1">// do is a complete context save/restore, everything in between can freely</span>
<span class="c1">// use the registers as it pleases since they will be scratched anyway.</span>
<span class="kt">void</span> <span class="nf">__attribute__</span><span class="p">((</span><span class="n">noinline</span><span class="p">,</span> <span class="kr">naked</span><span class="p">))</span> <span class="n">yield</span><span class="p">()</span> <span class="p">{</span>
<span class="c1">// Store current register contexts to the current stack</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"push {R4-R11, LR}"</span><span class="p">);</span>
<span class="c1">// Store the current context stack pointer</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov %0, sp"</span> <span class="o">:</span><span class="s">"=r"</span><span class="p">(</span><span class="n">sp</span><span class="p">[</span><span class="n">current_task</span><span class="p">]));</span>
<span class="c1">// Calculate which stack to move to</span>
<span class="n">current_task</span> <span class="o">=</span> <span class="p">(</span><span class="n">current_task</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">%</span> <span class="mi">2</span><span class="p">;</span>
<span class="c1">// Load the new stack pointer</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov sp, %0"</span> <span class="o">::</span> <span class="s">"r"</span><span class="p">(</span><span class="n">sp</span><span class="p">[</span><span class="n">current_task</span><span class="p">]));</span>
<span class="c1">// Pop the registers that were stored</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"pop {R4-R11, LR}"</span><span class="p">);</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"BX LR"</span><span class="p">);</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">task1</span><span class="p">()</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">var</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="n">printf</span><span class="p">(</span><span class="s">"In task1 main, var=%d</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">var</span><span class="o">++</span><span class="p">);</span>
<span class="n">yield</span><span class="p">();</span>
<span class="n">std</span><span class="o">::</span><span class="n">printf</span><span class="p">(</span><span class="s">"In task1 after yield, var=%d</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">var</span><span class="p">);</span>
<span class="k">while</span><span class="p">(</span><span class="nb">true</span><span class="p">)</span> <span class="p">{</span>
<span class="n">yield</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">task2</span><span class="p">()</span> <span class="p">{</span>
<span class="kt">int</span> <span class="n">var</span> <span class="o">=</span> <span class="mi">234</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="n">printf</span><span class="p">(</span><span class="s">"In task2 main, var=%d</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">var</span><span class="o">++</span><span class="p">);</span>
<span class="n">yield</span><span class="p">();</span>
<span class="n">std</span><span class="o">::</span><span class="n">printf</span><span class="p">(</span><span class="s">"In task2 after yield, var=%d</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">var</span><span class="p">);</span>
<span class="k">while</span><span class="p">(</span><span class="nb">true</span><span class="p">)</span> <span class="p">{</span>
<span class="n">yield</span><span class="p">();</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="p">[[</span><span class="n">noreturn</span><span class="p">]]</span> <span class="kt">int</span> <span class="n">main</span><span class="p">()</span> <span class="p">{</span>
<span class="c1">// Setup the stack for task2 so that a yield will start it</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov sp, %0"</span> <span class="o">::</span> <span class="s">"r"</span><span class="p">(</span><span class="n">sp</span><span class="p">[</span><span class="mi">1</span><span class="p">]));</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov lr, %0"</span> <span class="o">::</span> <span class="s">"r"</span><span class="p">(</span><span class="n">task2</span><span class="p">));</span>
<span class="c1">// R* registers contain garbage, but we don't really</span>
<span class="c1">// care. We simply want the pop executed in yield to</span>
<span class="c1">// pop the correct value into LR, so we push</span>
<span class="c1">// an entire context to the stack but only LR is of importance.</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"push {R4-R11, lr}"</span><span class="p">);</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov %0, sp"</span> <span class="o">:</span><span class="s">"=r"</span><span class="p">(</span><span class="n">sp</span><span class="p">[</span><span class="mi">1</span><span class="p">]));</span>
<span class="c1">// Set SP to be stack of task1 and branch to its main</span>
<span class="c1">// no need to perform any special setup since we just do a</span>
<span class="c1">// plain branch without link. The yield will take care of the correct</span>
<span class="c1">// book-keeping for us.</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov sp, %0"</span> <span class="o">::</span> <span class="s">"r"</span><span class="p">(</span><span class="n">sp</span><span class="p">[</span><span class="mi">0</span><span class="p">]));</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"mov r0, %0"</span> <span class="o">::</span> <span class="s">"r"</span><span class="p">(</span><span class="n">task1</span><span class="p">));</span>
<span class="c1">// Off we go</span>
<span class="k">asm</span><span class="p">(</span><span class="s">"BX r0"</span><span class="p">);</span>
<span class="c1">// Will never get here but we mark main with noreturn and</span>
<span class="c1">// loop here anyway.</span>
<span class="k">while</span><span class="p">(</span><span class="nb">true</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>And the output is indeed</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>In task1 main, var=1
In task2 main, var=234
In task1 after yield, var=2
In task2 after yield, var=235
</code></pre></div></div>
<h2 id="finally">Finally</h2>
<p>This was a first context switch using cooperative task switching. From here it is
a long road before we have even a remotely usable OS, never mind a real-time OS,
but small progress is still progress. Next I want to work on tying in a scheduler with
all this, at which point using C++ instead of C might make more of a difference. Also
stuff like task joining will come next.</p>One of the main features an OS provides is the ability to perform multiple tasks simultaneously, or at least simulate it so it seems that way.Adding a display2019-02-17T00:00:00+00:002019-02-17T00:00:00+00:00/blog/2019/02/17/adding-a-display<p>The <code class="highlighter-rouge">lm3s811evb</code> QEMU machine comes with an emulated OLED display,
which would be a nice thing to get going. Having a screen immediately
allows us to show and interact in more ways than just through the serial port.
<em>Code now also available in a <a href="https://gitlab.com/AndWass/eppos">Gitlab</a> repo.</em></p>
<p>The <a href="http://people.redhat.com/pbonzini/qemu-test-doc/_build/html/topics/ARM-System-emulator.html">documentation</a> tells us that it is a 96x16 OLED display using an SSD0303
controller connected to the <a href="https://en.wikipedia.org/wiki/I%C2%B2C">I2C</a> bus. Documentation for this setup is
generally scarce; the best source of information is actually the QEMU source
code. The source is quite easy to read though, so don’t be scared to use it as a reference.</p>
<h2 id="reference-information">Reference information</h2>
<ul>
<li><a href="http://www.ti.com/lit/ds/symlink/lm3s811.pdf">Stellaris LM3S811 datasheet</a>
<ul>
<li>I2C slave is not implemented</li>
</ul>
</li>
<li><a href="https://cdn-shop.adafruit.com/datasheets/SSD1306.pdf">SSD1306 datasheet</a>
<ul>
<li>Didn’t find the datasheet for SSD0303, but this appears to function the same</li>
</ul>
</li>
<li><a href="https://github.com/Xilinx/qemu/blob/master/hw/arm/stellaris.c">QEMU stellaris source code</a></li>
<li><a href="https://github.com/Xilinx/qemu/blob/master/hw/display/ssd0303.c">QEMU SSD0303 source code</a></li>
</ul>
<p>Some reference information regarding the QEMU setup with the SSD0303:</p>
<ul>
<li>SSD0303 I2C address: 0x3D (<a href="https://github.com/Xilinx/qemu/blob/master/hw/arm/stellaris.c#L1348">Source</a>)</li>
<li>Reading from the SSD0303 is not implemented</li>
</ul>
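<p>A note on the address: assuming <code class="highlighter-rouge">0x3D</code> in the QEMU source is the 7-bit I2C address, the first byte that goes out on the bus is that address shifted left by one, with the read/write bit in bit 0. The sketch below shows the arithmetic; <code class="highlighter-rouge">direction</code> and <code class="highlighter-rouge">address_byte</code> are names made up for this illustration and are not part of eppOS.</p>

```cpp
#include <cstdint>

// Hypothetical helper type: the R/W bit of an I2C address byte.
enum class direction : std::uint8_t { write = 0, read = 1 };

// Builds the first byte of an I2C transfer:
// 7-bit address in bits 7..1, R/W flag in bit 0.
constexpr std::uint8_t address_byte(std::uint8_t seven_bit_address, direction dir) {
    return static_cast<std::uint8_t>((seven_bit_address << 1) |
                                     static_cast<std::uint8_t>(dir));
}

// 0x3D is 0b0111101; shifted left by one it becomes 0x7A.
static_assert(address_byte(0x3D, direction::write) == 0x7A, "write address byte");
static_assert(address_byte(0x3D, direction::read) == 0x7B, "read address byte");
```

<p>This would explain why the value <code class="highlighter-rouge">0x7a</code> shows up when the display is constructed later in this post, although that is my reading of the numbers rather than something the QEMU source states.</p>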
<h2 id="embedded-drivers">Embedded drivers</h2>
<p>Most of the embedded code I have read and written does a really poor
job of separating drivers. It is not uncommon for a single driver to handle
both the I2C communication itself, which is very MCU-specific, and the
integrated circuit you communicate with.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">unsigned</span> <span class="kt">char</span> <span class="n">command_buffer</span><span class="p">[</span><span class="mi">32</span><span class="p">];</span>
<span class="kt">void</span> <span class="nf">dev_init</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Initialize MCU for I2C communication</span>
<span class="p">...</span>
<span class="c1">// Do some target device initialization</span>
<span class="p">...</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">dev_do_something</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Use MCU registers to send some commands over the I2C bus</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Or the code has, in some other way, a very tight coupling between the IC
and the I2C driver.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">unsigned</span> <span class="kt">char</span> <span class="n">command_buffer</span><span class="p">[</span><span class="mi">32</span><span class="p">];</span>
<span class="kt">void</span> <span class="nf">dev_init</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>
<span class="c1">// Do some target device initialization</span>
<span class="n">i2c_write_data</span><span class="p">(</span><span class="n">buffer</span><span class="p">,</span> <span class="n">length</span><span class="p">);</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">dev_do_something</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span> <span class="p">{</span>
<span class="n">i2c_write_data</span><span class="p">(</span><span class="n">buffer</span><span class="p">,</span> <span class="n">length</span><span class="p">);</span>
<span class="n">i2c_read_data</span><span class="p">(</span><span class="n">read_buffer</span><span class="p">,</span> <span class="n">read_length</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The second code example is slightly better than the first example, but only just.
It is not uncommon for an MCU to have multiple I2C buses, and reusing
the driver for the chip on a different hardware platform would still
require porting the driver to each new platform. Code reuse does <strong>not</strong> scale,
bugs need to be fixed in all variations of the driver, etc.</p>
<h2 id="reusable-driver-in-c">Reusable driver in C</h2>
<p>One quite common technique to allow drivers to talk to each other in C is to create a
<code class="highlighter-rouge">struct</code> of function pointers. These structs can be passed to other drivers and they can use
the function pointers to talk with the correct driver.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// i2c_dev_info tells which hardware I2C to initialize</span>
<span class="n">i2c_dev</span> <span class="n">i2c</span> <span class="o">=</span> <span class="n">i2c_create</span><span class="p">(</span><span class="n">i2c_dev_info</span><span class="p">);</span>
<span class="c1">// Allows the SSD0303 device to use I2C functionality</span>
<span class="n">display_dev</span> <span class="n">disp</span> <span class="o">=</span> <span class="n">ssd0303_create</span><span class="p">(</span><span class="o">&</span><span class="n">i2c</span><span class="p">,</span> <span class="cm">/* Other options */</span><span class="p">);</span>
<span class="n">disp</span><span class="p">.</span><span class="n">set_pixel</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">pixel_on</span><span class="p">);</span> <span class="c1">// Change some pixel value</span>
<span class="n">disp</span><span class="p">.</span><span class="n">update</span><span class="p">();</span> <span class="c1">// Show the change</span>
</code></pre></div></div>
<p>This is pretty much only a re-implementation of virtual member-functions in C++
though, so why not use that?</p>
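<p>To make the comparison concrete, here is a minimal sketch of the function-pointer technique. It is written as C-style code that also compiles as C++. Every name in it (<code class="highlighter-rouge">i2c_dev</code>, <code class="highlighter-rouge">display_dev</code>, <code class="highlighter-rouge">display_send_command</code>, <code class="highlighter-rouge">fake_bus_write</code>) is hypothetical and only loosely follows the snippet above.</p>

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical I2C "interface": an opaque context plus a function pointer.
struct i2c_dev {
    void *context;
    bool (*write)(void *context, std::uint8_t address,
                  const std::uint8_t *data, std::size_t size);
};

// Hypothetical display driver state. It only knows about i2c_dev,
// not about any particular MCU.
struct display_dev {
    i2c_dev *bus;
    std::uint8_t address;
};

// The display driver talks to the bus purely through the function pointer.
inline bool display_send_command(display_dev *disp, std::uint8_t command) {
    return disp->bus->write(disp->bus->context, disp->address, &command, 1);
}

// Host-side fake bus that records the last transfer, standing in for
// a real MCU-specific I2C driver.
struct fake_bus_state {
    std::uint8_t last_address = 0;
    std::uint8_t last_bytes[8] = {};
    std::size_t last_size = 0;
};

inline bool fake_bus_write(void *context, std::uint8_t address,
                           const std::uint8_t *data, std::size_t size) {
    auto *state = static_cast<fake_bus_state *>(context);
    if (size > sizeof state->last_bytes) {
        return false;  // the fake only stores short transfers
    }
    state->last_address = address;
    state->last_size = size;
    for (std::size_t i = 0; i < size; ++i) {
        state->last_bytes[i] = data[i];
    }
    return true;
}
```

<p>The <code class="highlighter-rouge">context</code> pointer plays the role of <code class="highlighter-rouge">this</code>, and the struct of function pointers plays the role of the table the C++ compiler generates for virtual functions.</p>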
<h2 id="easier-registry-access">Easier register access</h2>
<p>To allow for easier register access I created a class template that
both performs the necessary magic and also gives access to some helper
functions, such as a <code class="highlighter-rouge">reg.set_bit</code> function.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">template</span><span class="o"><</span><span class="n">std</span><span class="o">::</span><span class="kt">uintptr_t</span> <span class="n">Add</span><span class="p">,</span> <span class="k">class</span> <span class="nc">T</span><span class="p">,</span> <span class="k">class</span> <span class="nc">Access</span> <span class="o">=</span> <span class="n">read_write</span><span class="o">></span>
<span class="k">struct</span> <span class="n">peripheral_reg</span> <span class="p">{</span>
<span class="c1">// ... some details.</span>
<span class="p">};</span>
</code></pre></div></div>
<p>The nice thing about the access level is that I can disable write
functionality for read-only registers and let the compiler catch such errors.</p>
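<p>The details are elided above, so the following is only a sketch of how such an access parameter could work, not the actual eppOS implementation. It assumes empty tag types <code class="highlighter-rouge">read_only</code> and <code class="highlighter-rouge">read_write</code> and uses <code class="highlighter-rouge">std::enable_if_t</code> (C++17) to remove the write functions for read-only registers. The <code class="highlighter-rouge">is_writable</code> trait and the register addresses are made up for the example.</p>

```cpp
#include <cstdint>
#include <type_traits>

// Assumed access tags; the real ones are not shown in the post.
struct read_only {};
struct read_write {};

template<std::uintptr_t Addr, class T, class Access = read_write>
struct peripheral_reg {
    // Memory-mapped register: the address is fixed at compile time.
    static volatile T &ref() { return *reinterpret_cast<volatile T *>(Addr); }

    static T read() { return ref(); }

    // write() and set_bit() only exist when Access is read_write.
    template<class A = Access>
    static std::enable_if_t<std::is_same_v<A, read_write>> write(T value) {
        ref() = value;
    }

    template<class A = Access>
    static std::enable_if_t<std::is_same_v<A, read_write>> set_bit(unsigned bit) {
        ref() = static_cast<T>(ref() | (T{1} << bit));
    }
};

// Hypothetical detection trait: true if Reg::write(0) is a valid call.
template<class Reg, class = void>
struct is_writable : std::false_type {};

template<class Reg>
struct is_writable<Reg, std::void_t<decltype(Reg::write(0))>> : std::true_type {};

// Made-up register addresses, only used at compile time here.
using control_reg = peripheral_reg<0x40020000, std::uint32_t>;
using status_reg = peripheral_reg<0x40020004, std::uint32_t, read_only>;

static_assert(is_writable<control_reg>::value, "read_write registers can be written");
static_assert(!is_writable<status_reg>::value, "read_only registers cannot be written");
```

<p>With this layout a call such as <code class="highlighter-rouge">status_reg::write(1)</code> fails to compile, which is the kind of error the compiler is supposed to catch.</p>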
<h2 id="first-i2c-driver">First I2C driver</h2>
<p>The first I2C driver will be very simple. It will be blocking and only
support master write operations, since that is what the SSD0303 supports in QEMU.
I created an abstract base class <code class="highlighter-rouge">driver::i2c::master</code>.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">master</span>
<span class="p">{</span>
<span class="k">enum</span> <span class="k">class</span> <span class="nc">operation</span><span class="o">:</span> <span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span>
<span class="p">{</span>
<span class="n">write</span><span class="p">,</span>
<span class="n">read</span>
<span class="p">};</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">reset</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">set_slave_address</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">slave</span><span class="p">,</span> <span class="n">operation</span> <span class="n">op</span><span class="p">)</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">start</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">data</span><span class="p">)</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">start_stop</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">data</span><span class="p">)</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">bool</span> <span class="n">is_busy</span><span class="p">()</span> <span class="k">const</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">bool</span> <span class="n">is_error</span><span class="p">()</span> <span class="k">const</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">bool</span> <span class="n">is_arbitrition_lost</span><span class="p">()</span> <span class="k">const</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">repeat_start</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">data</span><span class="p">)</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">stop</span><span class="p">()</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">void</span> <span class="n">stop</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">data</span><span class="p">)</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">virtual</span> <span class="o">~</span><span class="n">master</span><span class="p">()</span> <span class="o">=</span> <span class="k">default</span><span class="p">;</span>
<span class="k">virtual</span> <span class="kt">bool</span> <span class="n">write</span><span class="p">(</span><span class="n">gsl</span><span class="o">::</span><span class="n">span</span><span class="o"><</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span><span class="o">></span> <span class="n">data</span><span class="p">);</span>
<span class="p">};</span>
</code></pre></div></div>
<p>It basically contains a bunch of methods meant to be implemented by
hardware-specific drivers. It also contains a basic write function
that uses the other functions to implement I2C write functionality.
It is marked virtual in case the target platform has built-in support for
writing I2C data, via DMA for example. The Stellaris MCU does not have any
special support for this though, so it uses the default algorithm.</p>
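<p>The default algorithm itself is not shown here, so the following is only a sketch of what a blocking write built on such primitives could look like. It uses a reduced, hypothetical interface (<code class="highlighter-rouge">sketch_master</code>), takes a pointer and a size instead of <code class="highlighter-rouge">gsl::span</code> to stay dependency-free, and invents a <code class="highlighter-rouge">next</code> function for the bytes in the middle of a transfer. The <code class="highlighter-rouge">recording_master</code> is a host-side fake used to observe the sequence.</p>

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical, reduced version of the master interface.
struct sketch_master {
    virtual void start(std::uint8_t data) = 0;       // START + first byte
    virtual void next(std::uint8_t data) = 0;        // middle byte (assumed name)
    virtual void stop(std::uint8_t data) = 0;        // last byte + STOP
    virtual void start_stop(std::uint8_t data) = 0;  // START + only byte + STOP
    virtual bool is_busy() const = 0;
    virtual bool is_error() const = 0;
    virtual ~sketch_master() = default;

    // Default blocking write, built only from the primitives above.
    // Virtual so that a platform with DMA support could replace it.
    virtual bool write(const std::uint8_t *data, std::size_t size) {
        for (std::size_t i = 0; i < size; ++i) {
            const bool first = (i == 0);
            const bool last = (i + 1 == size);
            if (first && last) {
                start_stop(data[i]);
            } else if (first) {
                start(data[i]);
            } else if (last) {
                stop(data[i]);
            } else {
                next(data[i]);
            }
            while (is_busy()) {
                // Blocking: spin until the byte has been sent.
            }
            if (is_error()) {
                return false;  // give up on the first reported error
            }
        }
        return true;
    }
};

// Host-side fake that logs one letter per primitive call.
struct recording_master : sketch_master {
    std::string log;
    void start(std::uint8_t) override { log += 'S'; }
    void next(std::uint8_t) override { log += 'N'; }
    void stop(std::uint8_t) override { log += 'P'; }
    void start_stop(std::uint8_t) override { log += 'X'; }
    bool is_busy() const override { return false; }
    bool is_error() const override { return false; }
};
```

<p>Error recovery, such as releasing the bus after a failed byte, is left out of the sketch; a real driver would most likely need it.</p>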
<p>Now a single I2C chip can be described as a chip with a bus-unique address
connected to a specific I2C bus. So the master operation above can be wrapped
in a chip struct of its own.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">chip</span>
<span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">slave_address</span><span class="p">;</span>
<span class="n">master</span> <span class="o">*</span><span class="n">master_impl</span><span class="p">;</span>
<span class="n">chip</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">slave_address</span><span class="p">,</span> <span class="n">master</span> <span class="o">*</span><span class="n">master_impl</span><span class="p">)</span><span class="o">:</span>
<span class="n">slave_address</span><span class="p">(</span><span class="n">slave_address</span><span class="p">),</span> <span class="n">master_impl</span><span class="p">(</span><span class="n">master_impl</span><span class="p">)</span>
<span class="p">{</span>
<span class="p">}</span>
<span class="kt">bool</span> <span class="n">write</span><span class="p">(</span><span class="n">gsl</span><span class="o">::</span><span class="n">span</span><span class="o"><</span><span class="k">const</span> <span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span><span class="o">></span> <span class="n">data</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">master_impl</span><span class="o">-></span><span class="n">set_slave_address</span><span class="p">(</span><span class="n">slave_address</span><span class="p">,</span> <span class="n">master</span><span class="o">::</span><span class="n">operation</span><span class="o">::</span><span class="n">write</span><span class="p">);</span>
<span class="k">return</span> <span class="n">master_impl</span><span class="o">-></span><span class="n">write</span><span class="p">(</span><span class="n">data</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>
<p>and all of a sudden we have (in my opinion) a very clean way of writing
to an I2C chip.</p>
<p>The above I2C combination can be used to create the basis of an <code class="highlighter-rouge">ssd0303</code>
struct. Note that I have so far not created an interface for a display class.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">ssd0303</span>
<span class="p">{</span>
<span class="n">driver</span><span class="o">::</span><span class="n">i2c</span><span class="o">::</span><span class="n">master</span><span class="o">::</span><span class="n">chip</span> <span class="n">chip</span><span class="p">;</span>
<span class="n">std</span><span class="o">::</span><span class="n">array</span><span class="o"><</span><span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span><span class="p">,</span> <span class="mi">132</span> <span class="o">+</span> <span class="mi">1</span><span class="o">></span> <span class="n">pixel_command_buffer</span><span class="p">[</span><span class="mi">2</span><span class="p">];</span>
<span class="n">ssd0303</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">i2c_address</span><span class="p">,</span> <span class="n">driver</span><span class="o">::</span><span class="n">i2c</span><span class="o">::</span><span class="n">master</span> <span class="o">*</span><span class="n">i2c_bus</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">setup</span><span class="p">();</span>
<span class="kt">void</span> <span class="n">turn_off</span><span class="p">();</span>
<span class="kt">void</span> <span class="n">turn_on</span><span class="p">();</span>
<span class="kt">void</span> <span class="n">send_command</span><span class="p">(</span><span class="n">std</span><span class="o">::</span><span class="kt">uint8_t</span> <span class="n">command</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">set_pixel</span><span class="p">(</span><span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">x</span><span class="p">,</span> <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">y</span><span class="p">,</span> <span class="kt">bool</span> <span class="n">on</span><span class="p">);</span>
<span class="kt">void</span> <span class="n">update</span><span class="p">();</span>
<span class="p">};</span>
</code></pre></div></div>
<p>Using this is very straightforward:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include <driver/i2c/stellaris.hpp>
#include <driver/display/ssd0303.hpp>
</span>
<span class="p">[[</span><span class="n">noreturn</span><span class="p">]]</span> <span class="kt">int</span> <span class="n">main</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">driver</span><span class="o">::</span><span class="n">i2c</span><span class="o">::</span><span class="n">stellaris</span><span class="o">::</span><span class="n">i2c1_master</span> <span class="n">i2c1</span><span class="p">;</span>
<span class="n">driver</span><span class="o">::</span><span class="n">display</span><span class="o">::</span><span class="n">ssd0303</span> <span class="n">ssd0303</span><span class="p">(</span><span class="mh">0x7a</span><span class="p">,</span> <span class="o">&</span><span class="n">i2c1</span><span class="p">);</span>
<span class="n">i2c1</span><span class="p">.</span><span class="n">setup</span><span class="p">();</span>
<span class="n">ssd0303</span><span class="p">.</span><span class="n">setup</span><span class="p">();</span>
<span class="c1">// Set some pixel data</span>
<span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">y</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">y</span><span class="o"><</span><span class="mi">16</span><span class="p">;</span> <span class="n">y</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
<span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">x</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">x</span><span class="o"><</span><span class="mi">132</span><span class="p">;</span> <span class="n">x</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
<span class="n">ssd0303</span><span class="p">.</span><span class="n">set_pixel</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span> <span class="n">x</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">0</span> <span class="o">&&</span> <span class="n">y</span> <span class="o">%</span> <span class="mi">2</span> <span class="o">==</span> <span class="mi">0</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1">// Update the display with new content.</span>
<span class="n">ssd0303</span><span class="p">.</span><span class="n">update</span><span class="p">();</span>
<span class="k">while</span><span class="p">(</span><span class="nb">true</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
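<p>The implementation of <code class="highlighter-rouge">set_pixel</code> is not shown, so here is one plausible way the calls above could map onto the two <code class="highlighter-rouge">pixel_command_buffer</code> arrays. The sketch assumes the SSD1306-style layout where the display is split into 8-pixel-high pages and each byte holds one column of 8 vertically stacked pixels, and it assumes the extra <code class="highlighter-rouge">+ 1</code> byte is room for a leading control byte. <code class="highlighter-rouge">sketch_framebuffer</code> is a name made up for the example.</p>

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for the buffer handling inside the ssd0303 struct.
struct sketch_framebuffer {
    static constexpr unsigned width = 132;
    static constexpr unsigned height = 16;

    // One buffer per page (8 rows each). Index 0 of every page is assumed
    // to be reserved for a control byte, so pixel columns start at index 1.
    std::array<std::uint8_t, width + 1> page[height / 8] = {};

    void set_pixel(unsigned x, unsigned y, bool on) {
        if (x >= width || y >= height) {
            return;  // silently ignore out-of-range coordinates
        }
        std::uint8_t &column = page[y / 8][1 + x];
        const std::uint8_t mask = static_cast<std::uint8_t>(1u << (y % 8));
        if (on) {
            column = static_cast<std::uint8_t>(column | mask);
        } else {
            column = static_cast<std::uint8_t>(column & ~mask);
        }
    }
};
```

<p>With a layout like this, <code class="highlighter-rouge">update</code> would only have to send each page buffer to the display in one I2C write, which is presumably why the pixels are kept in ready-to-send command buffers.</p>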
<h2 id="setup-methods">Setup methods</h2>
<p>A careful reader can see that I have <code class="highlighter-rouge">setup</code> methods and don’t really
use the constructor to set up a driver or peripheral. This isn’t idiomatic
C++. In the embedded world it isn’t uncommon to set up the peripheral,
briefly do some work and then shut it down to disable clocks and go to sleep.
Data stored in internal driver buffers should probably be preserved though, so
a <code class="highlighter-rouge">construct-work-destruct</code> sequence is not ideal either.</p>
<p>I certainly haven’t settled on one approach, but for the moment this felt closer
to how peripherals often are used.</p>The lm3s811evb QEMU machine comes with an emulated OLED display, which would be a nice thing to get going. Having a screen immediately allows us to show and interact in more ways than just through the serial port. Code now also available in a Gitlab repo.Debugging QEMU2019-02-13T00:00:00+00:002019-02-13T00:00:00+00:00/blog/2019/02/13/debugging-qemu<p>In <a href="/blog/2019/02/12/hello-eppos.html">Hello eppOS</a> I got a basic <code class="highlighter-rouge">Hello World</code> going
with stdout via UART. This allows us to print messages and have them
appear in the console in much the same way as in a normal program. Now we should add debugging
capabilities to the mix as well.</p>
<h2 id="gdb-and-qemu">GDB and QEMU</h2>
<p>The command we have used so far to launch QEMU has been <code class="highlighter-rouge">qemu-system-arm -M lm3s811evb -s -kernel build/eppos</code>.
This simply starts the emulator and immediately starts running the code. This is less than ideal when we want
to debug our code since we might miss what we actually want to look at. So when debugging we should
use the following command instead: <code class="highlighter-rouge">qemu-system-arm -M lm3s811evb -s -S -kernel build/eppos</code>. Notice the extra
<code class="highlighter-rouge">-S</code> (capital <code class="highlighter-rouge">S</code>), which will freeze the MCU at startup. This allows us to attach a debugger before any code is being run.</p>
<p>The option <code class="highlighter-rouge">-s</code> (lowercase <code class="highlighter-rouge">s</code>) tells QEMU to allow GDB connections on TCP port 1234. So these two options together are
all we need to start debugging.</p>
<h2 id="first-connection">First connection</h2>
<p>For this you will need two terminals available. Start the emulator with the command <code class="highlighter-rouge">qemu-system-arm -M lm3s811evb -s -S -kernel build/eppos</code>
in the first terminal. After this, use the second terminal and run the command <code class="highlighter-rouge">arm-none-eabi-gdb build/eppos</code>. This
will open a GDB shell. Now to connect to the emulator enter the following command in the GDB shell: <code class="highlighter-rouge">target remote localhost:1234</code>.</p>
<p>You should now be connected to the program running inside QEMU.</p>
<p><img src="/blog/images/blog/gdb_connected.png" alt="QEMU GDB" /></p>
<p>The program can be controlled and inspected as usual via the GDB terminal.</p>
<h2 id="vscode-integration">VSCode integration</h2>
<p>Now being able to use the GDB terminal for debugging is nice and all, but I like to debug using a GUI. And since
I chose VSCode as an “IDE” for eppOS, we should integrate with this. The only thing we have to do to integrate
with VSCode is to add a launch configuration to the project-local <code class="highlighter-rouge">.vscode/launch.json</code>.</p>
<p>My entire <code class="highlighter-rouge">launch.json</code> file looks like this</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
</span><span class="s2">"version"</span><span class="p">:</span><span class="w"> </span><span class="s2">"0.2.0"</span><span class="p">,</span><span class="w">
</span><span class="s2">"configurations"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
</span><span class="p">{</span><span class="w">
</span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"C++ Launch"</span><span class="p">,</span><span class="w">
</span><span class="s2">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"cppdbg"</span><span class="p">,</span><span class="w">
</span><span class="s2">"request"</span><span class="p">:</span><span class="w"> </span><span class="s2">"launch"</span><span class="p">,</span><span class="w">
</span><span class="s2">"program"</span><span class="p">:</span><span class="w"> </span><span class="s2">"${workspaceRoot}/build/eppos"</span><span class="p">,</span><span class="w">
</span><span class="s2">"miDebuggerServerAddress"</span><span class="p">:</span><span class="w"> </span><span class="s2">"localhost:1234"</span><span class="p">,</span><span class="w">
</span><span class="s2">"args"</span><span class="p">:</span><span class="w"> </span><span class="p">[],</span><span class="w">
</span><span class="s2">"stopAtEntry"</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span><span class="w">
</span><span class="s2">"cwd"</span><span class="p">:</span><span class="w"> </span><span class="s2">"${workspaceRoot}"</span><span class="p">,</span><span class="w">
</span><span class="s2">"environment"</span><span class="p">:</span><span class="w"> </span><span class="p">[],</span><span class="w">
</span><span class="s2">"externalConsole"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span><span class="w">
</span><span class="s2">"launchCompleteCommand"</span><span class="p">:</span><span class="w"> </span><span class="s2">"exec-run"</span><span class="p">,</span><span class="w">
</span><span class="s2">"windows"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="s2">"MIMode"</span><span class="p">:</span><span class="w"> </span><span class="s2">"gdb"</span><span class="p">,</span><span class="w">
</span><span class="s2">"miDebuggerPath"</span><span class="p">:</span><span class="w"> </span><span class="s2">"C:</span><span class="se">\\</span><span class="s2">Program Files (x86)</span><span class="se">\\</span><span class="s2">GNU Tools ARM Embedded</span><span class="se">\\</span><span class="s2">8 2018-q4-major</span><span class="se">\\</span><span class="s2">bin</span><span class="se">\\</span><span class="s2">arm-none-eabi-gdb.exe"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
<p>The important settings are <code class="highlighter-rouge">"miDebuggerServerAddress": "localhost:1234"</code> and</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="s2">"windows"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="s2">"MIMode"</span><span class="p">:</span><span class="w"> </span><span class="s2">"gdb"</span><span class="p">,</span><span class="w">
</span><span class="s2">"miDebuggerPath"</span><span class="p">:</span><span class="w"> </span><span class="s2">"C:</span><span class="se">\\</span><span class="s2">Program Files (x86)</span><span class="se">\\</span><span class="s2">GNU Tools ARM Embedded</span><span class="se">\\</span><span class="s2">8 2018-q4-major</span><span class="se">\\</span><span class="s2">bin</span><span class="se">\\</span><span class="s2">arm-none-eabi-gdb.exe"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
<p>If you are using Linux you will obviously have to change <code class="highlighter-rouge">"windows"</code> to <code class="highlighter-rouge">"linux"</code> and change the <code class="highlighter-rouge">"miDebuggerPath"</code> value
and so on.</p>
<p>With this simple launch configuration we can again launch QEMU for debugging, but instead of running GDB
in a terminal simply start debugging in VSCode. Pretty nifty!</p>
<p>Note though that we are compiling with <code class="highlighter-rouge">-Os</code> (set in the toolchain file) which means
optimize for size, so even though we have debug symbols available and everything, you might get some weird
debug results, like lines being skipped etc.</p>In Hello eppOS I got a basic Hello World going with stdout via UART. This allows us to print messages and have them appear in the console in much the same as a normal program. Now we should add debugging capabilities to the mix as well.Hello eppOS2019-02-12T00:00:00+00:002019-02-12T00:00:00+00:00/blog/2019/02/12/hello-eppos<p>To have an actual project to learn from I have decided to start working on
a basic embedded OS. True to the open source community every project needs a
name, so I have decided to call the OS eppOS, which is a (bad) acronym for
<em>Embedded C<strong>++</strong> OS</em>. This will be my playground that I can use to play with code
and features. The goal is <strong>not</strong> to develop a production-ready OS.</p>
<h2 id="development-tools">Development tools</h2>
<p>First order of business is to get a development environment up and running. I want
to be able to use both Windows and Linux. I also don’t want to depend on hardware,
at least not initially.</p>
<p>This means that I have decided to use the following tools:</p>
<ul>
<li><a href="https://developer.arm.com/open-source/gnu-toolchain/gnu-rm">The official GNU ARM embedded toolchain</a> from ARM.</li>
<li><a href="https://www.qemu.org/">QEMU</a></li>
<li><a href="https://cmake.org/">CMake</a></li>
<li>On Windows I use <a href="https://ninja-build.org/">Ninja</a> as CMake generator</li>
</ul>
<p>For developing and debugging I use <a href="https://code.visualstudio.com/">VSCode</a> with the <em>C/C++</em>, <em>CMake</em> and
<em>CMake Tools</em> extensions.</p>
<h3 id="installation">Installation</h3>
<p>Everything is very straight-forward to install. On Linux there is a good chance that a package is
available in the standard package manager, so that it is enough to run something like <code class="highlighter-rouge">apt-get install gcc-arm-none-eabi</code>.
On Windows it is easiest to download the individual setups, which should be available for pretty much
all tools. For QEMU you will want to make sure that <code class="highlighter-rouge">qemu-system-arm</code> is installed.</p>
<p>On Windows you will probably want to add the different installation locations to your <code class="highlighter-rouge">PATH</code> environment
variable as well.</p>
<h3 id="cmake-and-vscode">CMake and VSCode</h3>
<p>One of the main reasons I chose to use CMake and VSCode was the excellent <em>CMake Tools</em> extension. I have made some
minimal changes to the default settings, like setting <code class="highlighter-rouge">Ninja</code> as the default generator on Windows. Some
project-specific configurations have also been added but these will be explained further down.</p>
<h2 id="choosing-an-mcu-target">Choosing an MCU target</h2>
<p>I want to keep the MCU target rather small, but also reasonably modern. In my line of work I frequently come in
contact with Cortex M MCUs, like Cortex M3 and M4. QEMU only has a few machines that target Cortex M3s so I chose
to target the <a href="http://www.ti.com/lit/ds/symlink/lm3s811.pdf">Stellaris lm3s811evb</a> machine.</p>
<h2 id="first-program"><a name="first_program"></a>First program</h2>
<p>To test the toolchain and also test the development environment the first goal was to get a simple <code class="highlighter-rouge">Hello World</code>
to build and actually work in the emulator. The goal is to run this via QEMU, and somewhere get the “Hello World”
printout.</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include <cstdio>
</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">printf</span><span class="p">(</span><span class="s">"Hello World</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
<span class="c1">// Remember we are the OS, so do not return from main</span>
<span class="k">while</span><span class="p">(</span><span class="nb">true</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>If this small program works we can actually start interacting, at least one-way, with the MCU.</p>
<h2 id="project-layout">Project layout</h2>
<p>For this first small program I assume the following structure:</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>project
| Source and CMake files
|
|---build
| | CMake build directory
|
|---.vscode
| VSCode configuration files
</code></pre></div></div>
<h3 id="cmake-and-toolchains">CMake and toolchains</h3>
<p>The actual first order of business is to tell CMake which compiler to use. I opted for using
a <a href="https://cmake.org/cmake/help/v3.12/manual/cmake-toolchains.7.html">toolchain</a> file to have
the basic setup in a single file. After some trial and error (and googling) I ended up with the following
rather cobbled-together toolchain file:</p>
<div class="language-cmake highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># On Windows we want to append .exe to the compiler name.</span>
<span class="nb">if</span><span class="p">(</span>WIN32<span class="p">)</span>
<span class="nb">set</span><span class="p">(</span>CC_POSTFIX <span class="s2">".exe"</span><span class="p">)</span>
<span class="nb">else</span><span class="p">()</span>
<span class="nb">set</span><span class="p">(</span>CC_POSTFIX <span class="s2">""</span><span class="p">)</span>
<span class="nb">endif</span><span class="p">()</span>
<span class="nb">set</span><span class="p">(</span>CMAKE_SYSTEM_NAME Generic<span class="p">)</span>
<span class="nb">set</span><span class="p">(</span>CMAKE_SYSTEM_PROCESSOR arm<span class="p">)</span>
<span class="nb">set</span><span class="p">(</span>CMAKE_C_COMPILER <span class="s2">"</span><span class="si">${</span><span class="nv">ARM_COMPILER_PATH</span><span class="si">}</span><span class="s2">/arm-none-eabi-gcc</span><span class="si">${</span><span class="nv">CC_POSTFIX</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>
<span class="nb">set</span><span class="p">(</span>CMAKE_CXX_COMPILER <span class="s2">"</span><span class="si">${</span><span class="nv">ARM_COMPILER_PATH</span><span class="si">}</span><span class="s2">/arm-none-eabi-g++</span><span class="si">${</span><span class="nv">CC_POSTFIX</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>
<span class="nb">set</span><span class="p">(</span>CMAKE_CXX_FLAGS <span class="s2">"-g -mthumb -mcpu=cortex-m3 -L </span><span class="si">${</span><span class="nv">CMAKE_CURRENT_LIST_DIR</span><span class="si">}</span><span class="s2"> -T lm3s811evb.ld"</span><span class="p">)</span>
<span class="nb">set</span><span class="p">(</span>CMAKE_CXX_FLAGS <span class="s2">"</span><span class="si">${</span><span class="nv">CMAKE_CXX_FLAGS</span><span class="si">}</span><span class="s2"> --specs=nosys.specs --specs=nano.specs -Os -ffreestanding"</span><span class="p">)</span>
<span class="nb">set</span><span class="p">(</span>CMAKE_CXX_FLAGS <span class="s2">"</span><span class="si">${</span><span class="nv">CMAKE_CXX_FLAGS</span><span class="si">}</span><span class="s2"> -fno-rtti -fno-exceptions -fno-non-call-exceptions"</span><span class="p">)</span>
<span class="nb">set</span><span class="p">(</span>CMAKE_CXX_FLAGS <span class="s2">"</span><span class="si">${</span><span class="nv">CMAKE_CXX_FLAGS</span><span class="si">}</span><span class="s2"> -fno-common -ffunction-sections -fdata-sections"</span><span class="p">)</span>
<span class="nb">set</span><span class="p">(</span>CMAKE_CXX_FLAGS <span class="s2">"</span><span class="si">${</span><span class="nv">CMAKE_CXX_FLAGS</span><span class="si">}</span><span class="s2"> -Wl,--gc-sections"</span><span class="p">)</span>
<span class="nb">set</span><span class="p">(</span>CMAKE_C_COMPILER_WORKS ON<span class="p">)</span>
<span class="nb">set</span><span class="p">(</span>CMAKE_CXX_COMPILER_WORKS ON<span class="p">)</span>
<span class="nf">SET</span><span class="p">(</span>ASM_OPTIONS <span class="s2">"-x assembler-with-cpp"</span><span class="p">)</span>
<span class="nf">SET</span><span class="p">(</span>CMAKE_ASM_FLAGS <span class="s2">"</span><span class="si">${</span><span class="nv">CMAKE_CXX_FLAGS</span><span class="si">}</span><span class="s2"> </span><span class="si">${</span><span class="nv">ASM_OPTIONS</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span>
</code></pre></div></div>
<p>Now when I run CMake I only have to set the <code class="highlighter-rouge">ARM_COMPILER_PATH</code> variable to the full path of the directory that contains <code class="highlighter-rouge">arm-none-eabi-{gcc, g++}</code>. In
VSCode this can easily be done by adding the following to the <code class="highlighter-rouge">.vscode/settings.json</code> file:</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="s2">"cmake.configureSettings"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="s2">"ARM_COMPILER_PATH"</span><span class="p">:</span><span class="w"> </span><span class="s2">"C:/Program Files (x86)/GNU Tools ARM Embedded/8 2018-q4-major/bin"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>
<p>To make use of the toolchain file in an integrated way a CMake kit can be defined. Create the file
<code class="highlighter-rouge">.vscode/cmake-kits.json</code> and add the following:</p>
<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">[{</span><span class="w">
</span><span class="s2">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Local toolchain"</span><span class="p">,</span><span class="w">
</span><span class="s2">"toolchainFile"</span><span class="p">:</span><span class="w"> </span><span class="s2">"../toolchain.cmake"</span><span class="w">
</span><span class="p">}]</span><span class="w">
</span></code></pre></div></div>
<p>The <em>CMake Tools</em> extension can now automatically find and use the toolchain file when building
the project.</p>
<h3 id="startup-and-linker-files">Startup and linker files</h3>
<p>For the MCU to even start our program we must make sure the executable contains some information
in the correct place. All Cortex M3s (and possibly even all Cortex Ms?) start in the same manner:
read the stack pointer from flash address <code class="highlighter-rouge">0x00</code> then read the value from flash address <code class="highlighter-rouge">0x04</code>
and jump to the corresponding address. In very basic pseudocode it does the following:</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>stack_pointer = *0x00 # *0x00 denotes uint32 value stored in location 0x00
jump *0x04
</code></pre></div></div>
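<p>The pseudocode above can be sketched in C++ as a small host-side simulation. This is only an illustration under a few assumptions: the names <code class="highlighter-rouge">ResetState</code>, <code class="highlighter-rouge">read_le32</code> and <code class="highlighter-rouge">simulate_reset</code> are made up for this example, the flash image is assumed to be little-endian, and the lowest bit of the reset vector is treated as the Thumb marker and cleared to get the actual code address.</p>

```cpp
#include <cstdint>

// Hypothetical helper type: what the core has loaded after reset.
struct ResetState {
    std::uint32_t stack_pointer;  // word stored at flash offset 0x00
    std::uint32_t entry_point;    // word stored at flash offset 0x04, Thumb bit cleared
};

// Read a little-endian 32-bit word from a byte image.
inline std::uint32_t read_le32(const std::uint8_t* p) {
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Mimic the two reads the core performs before it runs any of our code.
// "flash" must point to at least 8 bytes (the first two vector table entries).
inline ResetState simulate_reset(const std::uint8_t* flash) {
    ResetState state{};
    state.stack_pointer = read_le32(flash + 0x00);
    // Bit 0 of the vector only marks Thumb code; the code itself starts
    // at the address with that bit cleared.
    state.entry_point = read_le32(flash + 0x04) & ~std::uint32_t{1};
    return state;
}
```

<p>Feeding it the first eight bytes of a built image shows which stack pointer and reset handler address the startup file and linker script actually produced.</p>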
<p>From there on the MCU is started and it is up to us to decide what happens next.</p>
<p>After much googling (yes I do that quite often) I managed to find that the ARM toolchain comes with
some samples that are an excellent starting-point. In <code class="highlighter-rouge"><installation directory>/share/gcc-arm-none-eabi/samples</code>
we will find both startup code and linker-scripts. And for our current scenario they are a perfect
starting point.</p>
<p>I copied <code class="highlighter-rouge">gcc.ld</code> from <code class="highlighter-rouge">ldscripts</code> folder and placed it in the project directory, and renamed it to
<code class="highlighter-rouge">lm3s811evb.ld</code> in the process.</p>
<p>The linker file has to be modified slightly to match our target. The original file has RAM starting at
address <code class="highlighter-rouge">0x10000000</code> while the lm3s811 MCU RAM starts at <code class="highlighter-rouge">0x20000000</code>.</p>
<pre><code class="language-ldscript">MEMORY
{
FLASH (rx) : ORIGIN = 0x0, LENGTH = 0x10000 /* 64K */
RAM (rwx) : ORIGIN = 0x20000000, LENGTH = 0x2000 /* 8K */
}
</code></pre>
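<p>The origins and lengths in the <code class="highlighter-rouge">MEMORY</code> block are easy to get wrong, so here is a small sketch that checks the arithmetic at compile time. The <code class="highlighter-rouge">Region</code> type and the constant names are hypothetical and only mirror the values from the linker script; nothing here is generated from the script itself.</p>

```cpp
#include <cstdint>

// Hypothetical mirror of one line in the linker script's MEMORY block.
struct Region {
    std::uint32_t origin;
    std::uint32_t length;

    // True if "address" lies inside [origin, origin + length).
    constexpr bool contains(std::uint32_t address) const {
        return address >= origin && (address - origin) < length;
    }

    // First address past the end of the region.
    constexpr std::uint32_t end() const { return origin + length; }
};

// Values copied from the MEMORY block above.
constexpr Region kFlash{0x00000000u, 0x10000u};
constexpr Region kRam{0x20000000u, 0x2000u};

static_assert(kFlash.length == 64u * 1024u, "0x10000 bytes is 64 KiB");
static_assert(kRam.length == 8u * 1024u, "0x2000 bytes is 8 KiB");
static_assert(kRam.end() == 0x20002000u, "RAM ends right after 8 KiB");
static_assert(!kRam.contains(0x10000000u),
              "the sample script's RAM origin is not inside the lm3s811 RAM");
```

<p>A stack pointer that starts at the top of RAM would then be <code class="highlighter-rouge">kRam.end()</code>, that is <code class="highlighter-rouge">0x20002000</code>.</p>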
<p>I also copied <code class="highlighter-rouge">startup_ARMCM3.S</code> from <code class="highlighter-rouge">startup</code> and placed it in the project directory, renaming it to
simply <code class="highlighter-rouge">startup.S</code>. The capital <code class="highlighter-rouge">.S</code> suffix means that it contains assembly code that is run through the C preprocessor before it is assembled.</p>
<p>I will most likely get back to these files and discuss them further, both as I learn more about them
and when they have to be modified. For now they don’t need any further changes.</p>
<h3 id="retargetting">Retargeting</h3>
<p>Our goal is to perform a complete retarget, where we want to emulate using <code class="highlighter-rouge">printf</code> to send
serial data. The ARM toolchain again comes with a basic example we can use as a starting point.
In <code class="highlighter-rouge">samples/src/retarget</code> we can copy both <code class="highlighter-rouge">main.c</code> and <code class="highlighter-rouge">retarget.c</code> to our project folder. Make
sure to change them to C++ files by changing the extensions to <code class="highlighter-rouge">.cpp</code> or similar as well.</p>
<p>Since C++ uses name mangling we need to disable this for the functions in <code class="highlighter-rouge">retarget.cpp</code> as well.
This is most easily done by adding <code class="highlighter-rouge">extern "C" {</code> at the top of the source file, with a closing
brace at the bottom.</p>
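<p>As a minimal sketch of what that wrapping looks like, assume a hook called <code class="highlighter-rouge">demo_write</code>. The name is hypothetical and chosen so it does not clash with the real <code class="highlighter-rouge">_write</code>; the body only counts bytes instead of talking to any hardware.</p>

```cpp
#include <cstddef>

// Everything declared inside this block gets C linkage, so the symbol is
// emitted as plain "demo_write" instead of a mangled C++ name. That is what
// lets C code in the standard library find the function at link time.
extern "C" {

int demo_write(int fd, const char* ptr, int len);

}  // extern "C"

// Bookkeeping for the example only.
static std::size_t g_bytes_seen = 0;

// The definition keeps the C linkage from the declaration above.
extern "C" int demo_write(int fd, const char* ptr, int len) {
    (void)fd;   // a single output channel, so the file id is ignored
    (void)ptr;  // the sketch does not look at the data itself
    g_bytes_seen += static_cast<std::size_t>(len);
    return len;  // report every byte as written
}
```

<p>Only the linkage changes; the function body is still compiled as ordinary C++.</p>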
<p>What this file does is it <em>“overrides”</em> some functions that the standard library uses to implement
functions like <code class="highlighter-rouge">printf</code>.</p>
<p>Now we must implement at least one function in a platform-specific manner; the <code class="highlighter-rouge">_write</code> function.
We want to emulate sending data to a serial port, or a <a href="https://en.wikipedia.org/wiki/Universal_asynchronous_receiver-transmitter">UART</a>,
which is basically a serial port but with different voltage levels. To send data via the UART port
we need to write the data to the peripheral register located on address <code class="highlighter-rouge">0x4000C000</code>. This can be
achieved by creating a pointer that points to this specific address: <code class="highlighter-rouge">volatile std::uint32_t* uart0dr = reinterpret_cast<std::uint32_t*>(0x4000C000);</code>.
This can then be used as such: <code class="highlighter-rouge">*uart0dr = 'A'; // Writes the character 'A'</code>.</p>
<p>The entire <code class="highlighter-rouge">_write</code> function is very basic:</p>
<div class="language-c++ highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">_write</span> <span class="p">(</span><span class="kt">int</span> <span class="n">fd</span><span class="p">,</span> <span class="kt">char</span> <span class="o">*</span><span class="n">ptr</span><span class="p">,</span> <span class="kt">int</span> <span class="n">len</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">volatile</span> <span class="n">std</span><span class="o">::</span><span class="kt">uint32_t</span><span class="o">*</span> <span class="n">uart0dr</span> <span class="o">=</span> <span class="k">reinterpret_cast</span><span class="o"><</span><span class="n">std</span><span class="o">::</span><span class="kt">uint32_t</span><span class="o">*></span><span class="p">(</span><span class="mh">0x4000C000</span><span class="p">);</span>
<span class="cm">/* Write "len" of char from "ptr" to file id "fd"
* Return number of char written.
* Need implementing with UART here. */</span>
<span class="p">(</span><span class="kt">void</span><span class="p">)</span><span class="n">fd</span><span class="p">;</span>
<span class="k">for</span><span class="p">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="p">;</span> <span class="n">i</span><span class="o"><</span><span class="n">len</span><span class="p">;</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
<span class="p">{</span>
<span class="o">*</span><span class="n">uart0dr</span> <span class="o">=</span> <span class="n">ptr</span><span class="p">[</span><span class="n">i</span><span class="p">];</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">len</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
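<p>On real hardware the transmit FIFO can fill up, so a more careful variant would wait before each write. The sketch below is not what I used in QEMU. It assumes a PL011-style register layout with the flag register at offset <code class="highlighter-rouge">0x18</code> and the "transmit FIFO full" flag in bit 5, which should be checked against the lm3s811 datasheet. <code class="highlighter-rouge">UartRegs</code>, <code class="highlighter-rouge">kTxFifoFull</code> and <code class="highlighter-rouge">uart_write</code> are names made up for this example. Passing the registers in as a parameter also makes it possible to try the function on a PC with a fake register block.</p>

```cpp
#include <cstddef>
#include <cstdint>

// Assumed register layout (PL011-style); verify against the datasheet.
struct UartRegs {
    volatile std::uint32_t dr;            // 0x00: data register
    volatile std::uint32_t rsr;           // 0x04: receive status
    volatile std::uint32_t reserved_[4];  // 0x08..0x14: unused here
    volatile std::uint32_t fr;            // 0x18: flag register
};
static_assert(offsetof(UartRegs, fr) == 0x18, "flag register offset");

// Assumed position of the "transmit FIFO full" flag.
constexpr std::uint32_t kTxFifoFull = 1u << 5;

// Write "len" characters from "ptr" to the given UART and return the
// number of characters written.
inline int uart_write(UartRegs& uart, const char* ptr, int len) {
    for (int i = 0; i < len; ++i) {
        // Busy-wait until there is room for one more character.
        while ((uart.fr & kTxFifoFull) != 0u) {
        }
        uart.dr = static_cast<std::uint8_t>(ptr[i]);
    }
    return len;
}
```

<p>On the target, <code class="highlighter-rouge">_write</code> could forward to this with <code class="highlighter-rouge">*reinterpret_cast&lt;UartRegs*&gt;(0x4000C000)</code> as the first argument, while a test on the PC can pass a zero-initialised <code class="highlighter-rouge">UartRegs</code> object instead.</p>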
<p>Change the content of <code class="highlighter-rouge">main.cpp</code> to what is shown in <a href="#first_program">First program</a>.</p>
<h2 id="putting-it-all-together">Putting it all together</h2>
<p>The only thing missing is a CMakeLists.txt before we can build and run, and hopefully
have QEMU show us the <code class="highlighter-rouge">Hello World</code> we are longing for. The CMakeLists.txt is very basic,
mine only contains</p>
<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>cmake_minimum_required(VERSION 3.8)
project(eppos)
enable_language(CXX ASM)
set(CMAKE_CXX_STANDARD 17)
add_executable(eppos main.cpp startup.S retarget.cpp)
</code></pre></div></div>
<p>With this the content of the project directory should be</p>
<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>CMakeLists.txt
toolchain.cmake
startup.S
lm3s811evb.ld
main.cpp
retarget.cpp
.vscode/settings.json
.vscode/cmake-kits.json
</code></pre></div></div>
<p>And that is it. With this the project should build.</p>
<h3 id="testing">Testing</h3>
<p>The final step is to test it in QEMU. Using a terminal located
in the project root directory the following command should work:
<code class="highlighter-rouge">qemu-system-arm -M lm3s811evb -serial stdio -s -kernel build/eppos</code>. The QEMU window should
show up and the terminal should print <code class="highlighter-rouge">Hello World</code>.</p>
<p><img src="/blog/images/blog/hello_world_terminal.png" alt="Hello world terminal" /></p>
<h2 id="summary">Summary</h2>
<p>This was my first time documenting all the steps I had to take to get a <em>“basic”</em> (nothing is basic
in the MCU world) <code class="highlighter-rouge">Hello World</code> application going. I am sure I don’t follow best practices, and I know
some shortcuts were taken, so the above will not work on a physical MCU.
There are probably some errors as well. But as a guide to how I got everything going, this should be rather complete.</p>
<p>The next step will be to get a debugger running and connect it to the QEMU emulator. Stay tuned!</p>To have an actual project to learn from I have decided to start working on a basic embedded OS. True to the open source community every project needs a name, so I have decided to call the OS eppOS, which is a (bad) acronym for Embedded C++ OS. This will be my playground that I can use to play with code and features. The goal is not to develop a production-ready OS.Hello world2019-02-11T00:00:00+00:002019-02-11T00:00:00+00:00/blog/2019/02/11/hello-world<p>My name is Andreas, I am a little over 30 years old and have always had an
interest in computers. It was quite logical that I would get into software
development; I started learning C++ as a hobby in my teenage years. My first
(and so far only) book is, shamefully, the dreaded <em>Learn C++ programming in 21 days</em>.</p>
<h2 id="experience">Experience</h2>
<p>I have worked as a consultant for a product development company since 2011. For
a long time I was the only software person at my work. Some electrical engineers
would also do some programming but when things became advanced they would turn to me.</p>
<p>The job has given me a wide range of experience, from using small Atmel AVR microcontrollers
to hacking the Linux kernel for usage in Embedded Linux, and developing Desktop applications.
I usually say I can, and will, do anything but web development.</p>
<p>When it comes to embedded development for microcontrollers I have mostly used C.
I had almost no prior experience when I started work and I inherited quite a lot
of code that was pretty much exclusively C so it was only natural that I would
continue down the C path. C++ is starting to get more and more interesting though,
and I want to learn more about what C++ can bring to the table.</p>
<h2 id="blog-goal">Blog goal</h2>
<p>The goal with this blog is to document what I learn that can be applicable to embedded C++.
My long term goal is to write an OS in C++, with focus on running on smaller ARM microcontrollers,
such as Cortex M3 or M4. The OS is intended as a learning exercise, both to learn more about ARM
microcontrollers and to see what C++ can bring to the table that C struggles with. Eventually I would
like to start using C++ at work in the embedded projects we have there.</p>My name is Andreas, I am a little over 30 years old and have always had an interest in computers. It was quite logical that I would get into software development; I started learning C++ as a hobby in my teenage years. My first (and so far only) book is, shamefully, the dreaded Learn C++ programming in 21 days.