_posts/2019-03-10-riscv-from-scratch-1.markdown
+6-6
Original file line number
Diff line number
Diff line change
@@ -7,23 +7,23 @@ description: A post that discusses what RISC-V is and why it's important, teache
7
7
---
8
8
9
9
{: .no_toc}
10
-
#### Table of contents
10
+
<div id="table-of-contents">Table of contents</div>
11
11
1. TOC
12
12
{:toc}
13
13
14
-
###Introduction
14
+
## Introduction
15
15
16
16
Welcome to part one of *RISC-V from scratch*! Throughout *RISC-V from scratch* we will explore various low-level concepts (compilation and linking, primitive runtimes, assembly, and more), typically through the lens of RISC-V and its ecosystem. I am a web developer by trade, and as such I'm not exposed to these things on a daily basis. However, I think they are very interesting - hence this series! Join me on a very much unstructured journey into the depths of all things low-level.
17
17
18
18
In this first post, we'll talk a little bit about what RISC-V is and why it's important, set up a RISC-V toolchain, and finish up with building and running a simple C program on emulated RISC-V hardware.
19
19
20
-
###So what is RISC-V?
20
+
## So what is RISC-V?
21
21
22
22
RISC-V is an open-source, free-to-use instruction set architecture (ISA) that began as a project at UC Berkeley in 2010. The free-to-use aspect has been instrumental in its success and is a stark contrast to many other architectures. Take ARM for example - in order to create an ARM-compatible processor, you must pay an upfront fee of [$1M - $10M as well as a 0.5% - 2% royalty fee per chip](https://www.anandtech.com/show/7112/the-arm-diaries-part-1-how-arms-business-model-works/2). This free and open model makes RISC-V an attractive option to many groups of people - hardware startups who can't foot the bill to create an ARM or other licensing-required processor, academic institutions, and (obviously) the open-source community.
23
23
24
24
RISC-V's meteoric rise in popularity hasn't gone unnoticed. [ARM launched a now-taken-down website](https://abopen.com/news/rattled-arm-launches-anti-risc-v-marketing-campaign/) that attempted (rather unsuccessfully) to highlight supposed benefits of ARM over RISC-V. RISC-V is backed by [a ton of major companies](https://riscv.org/membership/members/), including Google, Nvidia, and Western Digital.
25
25
26
-
###QEMU and RISC-V toolchain setup
26
+
## QEMU and RISC-V toolchain setup
27
27
28
28
We won't be able to run any code on a RISC-V processor until we have an environment to do it in. Fortunately, we don't need a physical RISC-V processor to do this - we'll instead be using [qemu](https://www.qemu.org). To install `qemu`, follow the [instructions for your operating system here](https://www.qemu.org/download). I'm using macOS, so for me this was as easy as:
Et voilà, we have a working RISC-V toolchain! All our executables, such as `riscv64-unknown-elf-gcc`, `riscv64-unknown-elf-gdb`, `riscv64-unknown-elf-ld`, etc., are located in `~/usys/riscv/riscv64-unknown-elf-gcc-<date>-<version>/bin/`.
This is a great start, but my goal with these blog posts is to truly [shave the yak](https://seths.blog/2005/03/dont_shave_that/), and while we have confirmed that we have a working toolchain, there is a lot of magic hidden by the niceties of the `freedom-e-sdk` examples. Note that we didn't have to set up any linker files or startup code - SiFive's provided board-support linker scripts, various Makefiles, and the [freedom-metal library](https://github.com/sifive/freedom-metal) take care of this for us.
_posts/2019-04-27-riscv-from-scratch-2.markdown
+11-11
@@ -7,19 +7,19 @@ description: A post describing how C programs get to the main function. Devicet
7
7
---
8
8
9
9
{: .no_toc}
10
-
#### Table of contents
10
+
<div id="table-of-contents">Table of contents</div>
11
11
1. TOC
12
12
{:toc}
13
13
14
-
###Introduction
14
+
## Introduction
15
15
16
16
Welcome to the second post in the *RISC-V from scratch* series! As a quick recap, throughout *RISC-V from scratch* we will explore various low-level concepts (compilation and linking, primitive runtimes, assembly, and more), typically through the lens of RISC-V and its ecosystem. In [the first post of this series]({% post_url 2019-03-10-riscv-from-scratch-1 %}), we introduced RISC-V, explained why it's important, set up the full GNU RISC-V toolchain, and built and ran a simple program on an emulated version of a RISC-V processor with the help of [SiFive's freedom-e-sdk](https://github.com/sifive/freedom-e-sdk).
17
17
18
18
The `freedom-e-sdk` made it trivial for us to compile, debug, and run any C program on an emulated or physical RISC-V processor. We didn't have to worry about setting up any linker scripts or writing a runtime that sets up our stack, calls into `main`, and more. This is great if you're looking to quickly become productive, but these details are exactly the sort of thing we want to learn about!
19
19
20
20
In this post, we'll break free from the `freedom-e-sdk`. We'll write and attempt to debug a simple C program of our own, unveil the magic hidden behind `main`, and examine the hardware layout of a `qemu` virtual machine. We'll then examine and modify a linker script, write our own C runtime to get our program set up and running, and finally invoke GDB and step through our program.
21
21
22
-
###Setup
22
+
## Setup
23
23
24
24
If you missed the previous post in this series, don't have `riscv-qemu` and the RISC-V toolchain installed, and want to follow along, jump to the ["QEMU and RISC-V toolchain setup"](/riscv-from-scratch/2019/03/10/riscv-from-scratch-1.html#qemu-and-risc-v-toolchain-setup) section (or in RISC-V assembly, `jal x0, qemu_and_toolchain_setup`) and complete that before moving on.
25
25
@@ -36,7 +36,7 @@ cd riscv-from-scratch/work
36
36
37
37
As the name suggests, the `work` directory will serve as our working directory for this and future posts.
38
38
39
-
###The naive approach
39
+
## The naive approach
40
40
41
41
Let's start our journey by using the text editor of your choice to create a simple C program called `add.c` that infinitely adds two numbers together.
42
42
@@ -142,7 +142,7 @@ There are several red flags here:
142
142
143
143
These indicators, in combination with the fact that we never hit a breakpoint, signal that we have done _something_ wrong. But what is it?
144
144
145
-
###Lifting the `-v`eil
145
+
## Lifting the `-v`eil
146
146
147
147
To figure out what's going on here, we need to take a detour and talk about how our simple C program actually works underneath the surface. We have a function called `main` that does our simple addition, but what _is_ `main`, really? Why must it be called `main` and not `origin`, or `begin`, or `entry`? Conventionally we know that all executables start running at `main`, but what magic occurs to make this happen?
148
148
@@ -179,7 +179,7 @@ Knowing this, we see that GCC is linking multiple different `crt` object files w
179
179
180
180
What exactly this bootstrapping of initial execution is depends on the platform in question, but generally it includes important tasks such as setting up the stack frame, passing along command line arguments, and calling into `main`. Yes, we have _finally_ answered the question posed at the beginning of this section - it is `_start` who calls into our `main` function!
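To make the division of labor between `_start` and `main` concrete, here is a minimal hosted-C imitation of the last step a startup routine performs. This is only a sketch under stated assumptions: `fake_start` and `demo_main` are hypothetical names invented for illustration, and a real `_start` is typically written in assembly because it must also set up the stack pointer, which plain C cannot express.

```c
/* Signature shared by any function we want to treat as "main". */
typedef int (*main_fn)(int argc, char **argv);

/* Hypothetical stand-in for _start: receive the arguments, call into
 * main, and hand back main's exit status. A real runtime would first
 * set up the stack and would pass the status on to exit(). */
int fake_start(main_fn program_main, int argc, char **argv)
{
    int status = program_main(argc, argv);
    return status;
}

/* Hypothetical "main" used for illustration: reports how many
 * arguments it was given. */
int demo_main(int argc, char **argv)
{
    (void)argv; /* unused in this sketch */
    return argc;
}
```

Calling `fake_start(demo_main, argc, argv)` shows that nothing is special about the name `main` itself. It is simply the symbol the startup code has agreed to call.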
181
181
182
-
###Finding our stack
182
+
## Finding our stack
183
183
184
184
We've solved one mystery, but you might be wondering how this gets us any closer to our original goal of being able to step through our simple C program with `gdb`. There are a few problems left to address, and the first has to do with the way `crt0` sets up our stack.
185
185
@@ -242,7 +242,7 @@ head -n8 riscv64-virt.dts
242
242
243
243
And there we have it - it takes two 32-bit values (cells) to specify an address, and two 32-bit values to specify length. This means, given `reg = <0x00 0x80000000 0x00 0x8000000>;`, our memory begins at `0x00 + 0x80000000` (`0x80000000`) and extends `0x00` + `0x8000000` (`0x8000000`) bytes, meaning it ends at `0x88000000`. In more human-friendly terms, we can use a hexadecimal calculator to determine that our length of `0x8000000` bytes is 128 megabytes.
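If you'd rather not reach for a calculator, the same arithmetic can be sketched in a few lines of C. The helper names below (`cells_to_u64`, `region_end`, `bytes_to_mib`) are made up for this example. The sketch treats the first cell of each pair as the upper 32 bits of a 64-bit value, which for our `reg` property (upper cells of `0x00`) gives the same result as the sum above.

```c
#include <stdint.h>

/* Combine two 32-bit devicetree cells (high, low) into one 64-bit value. */
uint64_t cells_to_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* One-past-the-end address of a region described by a reg property
 * that uses two address cells followed by two size cells. */
uint64_t region_end(const uint32_t reg[4])
{
    uint64_t base = cells_to_u64(reg[0], reg[1]);
    uint64_t size = cells_to_u64(reg[2], reg[3]);
    return base + size;
}

/* Convert a byte count to whole megabytes (2^20 bytes each). */
uint64_t bytes_to_mib(uint64_t bytes)
{
    return bytes >> 20;
}
```

Feeding in `{0x00, 0x80000000, 0x00, 0x8000000}` yields an end address of `0x88000000` and a length of 128 megabytes, matching the values worked out by hand.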
244
244
245
-
###Link it up
245
+
## Link it up
246
246
247
247
Using `qemu` and `dtc`, we've successfully discovered where the RAM lives and how long it extends in our `virt` virtual machine. We also know that `gcc` is linking a default `crt0` that isn't setting up our stack the way we need it to. But what exactly do we do with this information, and how does it get us any closer to getting a running, debuggable program?
248
248
@@ -319,7 +319,7 @@ SECTIONS
319
319
320
320
As you can see, we use the [PROVIDE command](https://web.archive.org/web/20190525173911/https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/assignments.html#PROVIDE) to define a symbol called `__stack_top`. `__stack_top` will be accessible from any program linked with this script (assuming the program itself does not also define something named `__stack_top`). We set the value of `__stack_top` to be `ORIGIN(RAM)`, which we know is `0x80000000`, plus `LENGTH(RAM)`, which we know is 128 megabytes (`0x8000000` bytes). This means our `__stack_top` is set to `0x88000000`.
We finally have all we need to create a custom C runtime that works for us, so let's get started. Create a file called `crt0.s` in the `riscv-from-scratch/work/` directory and insert the following:
@@ -433,7 +433,7 @@ _start:
433
433
434
434
Our very last line is an assembler directive, `.end`, which simply marks the end of the assembly file.
435
435
436
-
###Debugging, but for real this time
436
+
## Debugging, but for real this time
437
437
438
438
To recap, we've worked through many problems in our quest to debug a simple C program on a RISC-V processor. We first used `qemu` and `dtc` to find where our memory was located in the `virt` virtual RISC-V machine. We then used this information to take manual control of the memory layout in our customized version of the default `riscv64-unknown-elf-ld` linker script, which then enabled us to accurately define a `__stack_top` symbol. We finished by using this symbol in our own custom `crt0.s` that set up our stack and global pointers and finally called the `main` function. Let's make use of all this work to complete our original goal of debugging our simple C program in GDB.
439
439
@@ -527,14 +527,14 @@ You'll notice from the above output that we have successfully hit a breakpoint o
527
527
528
528
From here we can use `gdb` as normal - `s` to step to the next instruction, `info all-registers` to inspect the values inside our registers as our program executes, and so on. Experiment to your heart's content... we certainly worked hard enough to get here!
529
529
530
-
###What's next
530
+
## What's next
531
531
532
532
In our next post, we'll continue to build on our knowledge of RISC-V assembly by beginning implementation of a driver for the UART onboard the `virt` QEMU machine. Expect to learn about what a UART is and how it works, additional devicetree properties, the basic building blocks required to implement an NS16550A-compatible UART driver, and more.
533
533
534
534
Sound interesting? This post has been released - [click here to check it out](https://twilco.github.io{% post_url 2019-07-08-riscv-from-scratch-3 %}). If you have any questions, comments, or corrections, feel free to [open up an issue](https://github.com/twilco/twilco.github.io/issues) or leave a comment below via [utterances](https://github.com/utterance/utterances).
535
535
536
536
Thanks for reading!
537
537
538
-
###Extra credit
538
+
## Extra credit
539
539
540
540
If you enjoyed this post and want even more, [Matt Godbolt gave a presentation titled "The Bits Between the Bits: How We Get to main()"](https://www.youtube.com/watch?v=dOfucXtyEsU) at CppCon 2018 that approaches this subject from a few different angles than we took here in this post. If you've worked through the entirety of this post you will definitely recognize some of the things he covers. It's a good talk, so check it out!
_posts/2019-07-08-riscv-from-scratch-3.markdown
+8-8
@@ -7,11 +7,11 @@ description: A post beginning implementation of a NS16550A UART driver for the Q
7
7
---
8
8
9
9
{: .no_toc}
10
-
#### Table of contents
10
+
<div id="table-of-contents">Table of contents</div>
11
11
1. TOC
12
12
{:toc}
13
13
14
-
###Introduction
14
+
## Introduction
15
15
16
16
Welcome to the third post in the *RISC-V from scratch* series! As a quick recap, throughout *RISC-V from scratch* we will explore various low-level concepts (compilation and linking, primitive runtimes, assembly, and more), typically through the lens of RISC-V and its ecosystem.
17
17
@@ -23,7 +23,7 @@ If this is the first post in this series that you are tuning into and would like
23
23
24
24
So, without further ado, let's begin.
25
25
26
-
###What is a UART?
26
+
## What is a UART?
27
27
28
28
UART stands for "**U**niversal **A**synchronous **R**eceiver-**T**ransmitter", and is a physical hardware device (_not_ a protocol, à la [I2C](https://en.wikipedia.org/wiki/I%C2%B2C) or [SPI](https://en.wikipedia.org/wiki/Serial_Peripheral_Interface)) used to transmit and receive serial data. Serial data transmission is the process of sending data sequentially, bit-by-bit. In contrast, parallel data transmission is the process of sending multiple bits all at once. This image from the [serial communication Wikipedia page](https://en.wikipedia.org/wiki/Serial_communication) illustrates the difference well:
29
29
@@ -37,7 +37,7 @@ You may also be familiar with USARTs (**U**niversal **S**ynchronous/**A**synchro
37
37
38
38
UARTs and USARTs are all around you, even if you may not realize it. They are built into nearly every modern microcontroller, our `virt` machine included. These devices help power the traffic lights you yield to, the refrigerator that cools your food, and the satellites that orbit the Earth for years on end.
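To make the earlier serial-versus-parallel distinction a bit more tangible, here is a small C sketch of what "sending data bit-by-bit" means for a single byte. The function names are invented for this example, and the least-significant-bit-first ordering is the usual UART convention rather than something specific to our `virt` machine. Start, stop, and parity bits are left out entirely.

```c
#include <stdint.h>

/* Split one byte into the order a serial line would carry it: one bit
 * per time slot, least-significant bit first. out[0] is the first bit
 * on the wire. */
void serialize_lsb_first(uint8_t byte, uint8_t out[8])
{
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)((byte >> i) & 1u);
    }
}

/* Reassemble the byte on the receiving side from the same bit stream. */
uint8_t deserialize_lsb_first(const uint8_t in[8])
{
    uint8_t byte = 0;
    for (int i = 0; i < 8; i++) {
        byte |= (uint8_t)((in[i] & 1u) << i);
    }
    return byte;
}
```

A parallel link would instead present all eight of those bits at once on eight separate wires. The serial version trades that width for time.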
39
39
40
-
###Setup
40
+
## Setup
41
41
42
42
Before we get down to writing our driver, we'll need a few things set up to ensure we can properly compile and link. If you've worked through the previous two posts in this series you shouldn't have to do anything here beyond a `cd some/path/to/riscv-from-scratch`.
43
43
@@ -67,7 +67,7 @@ cp -a src/. work
67
67
68
68
If you're curious to know more about this customized linker script and minimal C runtime, check out the [previous post]({% post_url 2019-04-27-riscv-from-scratch-2 %}).
69
69
70
-
###Hardware layout in review
70
+
## Hardware layout in review
71
71
72
72
Before we begin writing our driver, we'll need a little bit more information. How do we configure the UART that's onboard `virt`? At what memory address can we find the receive and transmission buffers?
73
73
@@ -120,7 +120,7 @@ Our next property is `reg = <0x00 0x10000000 0x00 0x100>;`, which determines the
120
120
121
121
This brings us to the last property in our `uart` node, `compatible = "ns16550a";`, which informs us what programming model our UART is compatible with. Operating systems use this property to determine what device drivers they can use for a peripheral. There are plentiful resources showing all the details necessary to implement an NS16550A-compatible UART, including [this one](https://www.lammertbies.nl/comm/info/serial-uart.html) which we'll be referencing from here on out.
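As a rough illustration of how a `compatible` string might be used, the sketch below looks a string up in a small driver table. Everything here is hypothetical: the `driver_entry` struct, the table contents, and `find_driver` are assumptions made for this example and do not reflect how any particular operating system implements driver matching.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a compatible string and the name of the
 * driver that claims it. */
struct driver_entry {
    const char *compatible;
    const char *driver_name;
};

/* Illustrative table; the entries are assumptions for this sketch only. */
static const struct driver_entry driver_table[] = {
    { "ns16550a", "ns16550a-uart" },
    { "example,other-device", "other-driver" },
};

/* Return the driver name registered for a compatible string, or NULL
 * when no driver in the table claims it. */
const char *find_driver(const char *compatible)
{
    size_t count = sizeof driver_table / sizeof driver_table[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(driver_table[i].compatible, compatible) == 0) {
            return driver_table[i].driver_name;
        }
    }
    return NULL;
}
```

With this table, `find_driver("ns16550a")` returns the UART driver's name, while an unrecognized string returns `NULL`, leaving the peripheral without a driver.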
122
122
123
-
###Creating the basic skeleton of our driver
123
+
## Creating the basic skeleton of our driver
124
124
125
125
We have all we need to begin writing our driver, so let's begin. Start by ensuring you're in the `riscv-from-scratch/work` directory we created in the setup section:
126
126
@@ -227,7 +227,7 @@ riscv64-unknown-elf-nm a.out
227
227
0000000080000018 T uart_put_char
228
228
{% endhighlight %}
229
229
230
-
###Setting the base address
230
+
## Setting the base address
231
231
232
232
Again referencing [this resource](https://www.lammertbies.nl/comm/info/serial-uart.html), NS16550A UARTs have twelve registers, each accessible at a byte offset from the base address. In order to be able to get at these registers from our driver code, we'll first need to define a symbol representing this base address. As we discovered from the decompiled devicetree file above, `riscv64-virt.dts`, the base address is located at `0x00 + 0x10000000 = 0x10000000`, as that is what is in the `reg` property:
233
233
@@ -260,7 +260,7 @@ SECTIONS
260
260
261
261
With our `__uart_base_addr` established and codified as a symbol, we'll now have easy access to the NS16550A registers from within our driver file, `ns16550a.s`.
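Our driver in this series is written in assembly, but the idea of "base address plus byte offset" may be easier to see in C first. The following is a hedged sketch, not the driver itself: the names `uart_reg_addr` and `uart_put_char_sketch` are invented here, the offsets follow the common 16550 register layout (transmit holding register at offset 0, line status register at offset 5), and polling the line status register before writing is an assumption about how a simple driver might behave.

```c
#include <stdint.h>

/* Register offsets from the UART base, per the common 16550 layout. */
enum {
    UART_THR = 0, /* transmit holding register (write) */
    UART_LSR = 5  /* line status register */
};

/* LSR bit 5: transmit holding register is empty and can accept a byte. */
#define UART_LSR_THR_EMPTY 0x20u

/* Address of a register given the base address and a byte offset. */
uint64_t uart_reg_addr(uint64_t base, unsigned offset)
{
    return base + offset;
}

/* Write one character once the transmitter reports it is ready.
 * `regs` points at the UART's register window; on `virt` that would be
 * the address behind __uart_base_addr, while in a test it can simply
 * be an ordinary array. */
void uart_put_char_sketch(volatile uint8_t *regs, uint8_t c)
{
    while ((regs[UART_LSR] & UART_LSR_THR_EMPTY) == 0) {
        /* spin until the holding register is empty */
    }
    regs[UART_THR] = c;
}
```

Passing the register window in as a pointer keeps the sketch testable on a regular machine. With a base of `0x10000000`, the line status register would sit at `0x10000005`.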
262
262
263
-
###Next steps
263
+
## Next steps
264
264
265
265
Today we learned about UARTs and USARTs, the NS16550A specification, interrupts, and some additional devicetree properties. We have also created a solid skeleton for our UART assembly driver, and have codified the `__uart_base_addr` as a symbol in our linker file for easy UART register access.