doc: AI proofread

Assisted-by: Zed:gemini-3.5-flash
2026-05-28 22:49:29 +02:00
parent fd4e890483
commit 5a2d352807
5 changed files with 62 additions and 62 deletions

@@ -38,7 +38,7 @@
#show link: set text(fill: blue.darken(60%)) #show link: set text(fill: blue.darken(60%))
#v(5em) #v(5em)
#infobox()[ #infobox()[
The repository for this labs can be found at the following address: The repository for these labs can be found at the following address:
#align(center)[https://github.com/Klagarge/MSE-MA-CSEL] #align(center)[https://github.com/Klagarge/MSE-MA-CSEL]
] ]

@@ -8,14 +8,14 @@ This laboratory implements a user-space application for the NanoPi NEO Plus2 tha
== Design == Design
The application is based on multithreading: one thread handles the #gls("led", long: false) timing, while another handles button events. #gls("gpio", long: false) are accessed through #gls("sysfs", long: false), which allows the #gls("led", long: false) and buttons to be managed as file descriptors. A key design choice was to centralize all events with a single #gls("epoll", long: false) instance, so both timer events and button events can be processed efficiently. The application is based on multithreading: one thread handles the #gls("led", long: false) timing, while another handles button events. #gls("gpio", long: false) are accessed through #gls("sysfs", long: false), which allows the #gls("led", long: false) and buttons to be managed as file descriptors. A key design choice was to centralize all events with a single #gls("epoll", long: false) instance, so both timer events and button events can be processed efficiently.
The timer thread use only 1 timer and set the initial time on every cycle. That allow to allocate only once in the timer and avoid memory fragmentation. The button thread writes the next time to sleep on a shared variable, and the timer thread read this variable to set the next time to sleep. Since we have only one provider of this variable, we don't need to use a mutex to protect it. The timer thread uses only one timer and sets the initial time on every cycle. This allows us to allocate resources only once for the timer and avoid memory fragmentation. The button thread writes the next sleep duration to a shared variable, which the timer thread reads to set its next sleep interval. Since there is only one writer for this variable, we do not need a mutex to protect it.
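One way to keep such a single-writer variable well-defined without a mutex is a C11 atomic. The sketch below is an assumption about how this could look, not the lab's actual code; the names `next_period_ms`, `button_set_period`, and `timer_get_period` are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared variable: next sleep duration in milliseconds.
 * Written only by the button thread, read only by the timer thread. */
static _Atomic uint32_t next_period_ms = 500;

/* Called by the button thread (the single writer). */
void button_set_period(uint32_t ms)
{
    atomic_store_explicit(&next_period_ms, ms, memory_order_relaxed);
}

/* Called by the timer thread before it re-arms its timer. */
uint32_t timer_get_period(void)
{
    return atomic_load_explicit(&next_period_ms, memory_order_relaxed);
}
```

Relaxed ordering is enough here under the assumption that the value is independent of any other shared data; the reader only ever needs the latest complete value.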
All logs are done using the syslog at info level: All logs are written to syslog at the INFO level:
```c ```c
// First, we open the syslog with a specific name and facility // First, we open the syslog with a specific name and facility
// LOG_PID to include the PID (process ID) in the logs // LOG_PID to include the PID (process ID) in the logs
// LOG_USER to specify the log facility (what type of programme) // LOG_USER to specify the log facility (what type of program)
openlog("CSEL Logs", LOG_PID, LOG_USER); openlog("CSEL Logs", LOG_PID, LOG_USER);
// Then log what you want: // Then log what you want:
@@ -23,10 +23,10 @@ syslog(LOG_INFO, "Start logging silly led-controller"); // INFO level
``` ```
== Difficulties == Difficulties
The most difficult part was understanding the #gls("gpio", long: false) mapping between the physical pins and the #gls("sysfs", long: false) #gls("gpio", long: false) numbers. All can be found in the #link("https://linux-sunxi.org/GPIO", [*sunxi driver*]) which is the driver for #gls("gpio", long: false). The most difficult part was understanding the #gls("gpio", long: false) mapping between the physical pins and the #gls("sysfs", long: false) #gls("gpio", long: false) numbers. This mapping can be found in the #link("https://linux-sunxi.org/GPIO", [*sunxi driver*]) documentation, which describes the driver for the #gls("gpio", long: false) controller.
== Results == Results
We can demonstrate that the application works in an efficient than the silly #gls("led", long: false) controller given: We can demonstrate that the application works more efficiently than the provided silly #gls("led", long: false) controller:
#table( #table(
columns: (1fr, 1fr), columns: (1fr, 1fr),
@@ -35,15 +35,15 @@ We can demonstrate that the application works in an efficient than the silly #gl
[ [
#figure( #figure(
image("test-silly.png", height: 10em), image("test-silly.png", height: 10em),
caption:[Run silly #gls("led", long: false) controller on NanoPi] caption:[Running the silly #gls("led", long: false) controller on the NanoPi]
)<fig-silly> )<fig-silly>
],[ ],[
#figure( #figure(
image("test-epoll.png", height: 10em), image("test-epoll.png", height: 10em),
caption:[Run #gls("epoll", long: false) #gls("led", long: false) controller on NanoPi] caption:[Running the #gls("epoll", long: false)-based #gls("led", long: false) controller on the NanoPi]
)<fig-epoll> )<fig-epoll>
] ]
) )
We can see the difference between @fig-silly and @fig-epoll. One is using a core at 100% and the other one not. We can see the difference between @fig-silly and @fig-epoll. One utilizes 100% of a CPU core, whereas the other does not.

@@ -4,21 +4,21 @@
== Process, signals, and communication == Process, signals, and communication
The aim of this laboratory is to create a child process from the parent with `fork()`. Then, each processus executes the same code until they are killed. This happens the same when programming #gls("gpu", long: false) with #gls("cuda", long: false) or #gls("openmp", long: false). The different processus are differenciated by the #gls("pid", long: false). The aim of this laboratory is to create a child process from a parent process using `fork()`. Both processes then execute the same code until they are terminated. This is similar to parallel programming with #gls("gpu", long: false) using #gls("cuda", long: false) or #gls("openmp", long: false). The processes are differentiated by their #gls("pid", long: false).
The child must communicate with the parents with a `socketpair`: The child process must communicate with the parent process using a `socketpair`:
```c ```c
/* Setup socket for inter-process communication */ /* Setup socket for inter-process communication */
int fd[2]; int fd[2];
int err = socketpair(AF_UNIX, SOCK_STREAM, 0, fd); int err = socketpair(AF_UNIX, SOCK_STREAM, 0, fd);
if (err == -1) { if (err == -1) {
perror("socketpair fail");AF_UNIX perror("socketpair fail");
exit(EXIT_FAILURE); exit(EXIT_FAILURE);
} }
``` ```
This creates a local socket for inter-process communication. It return 2 file descriptors to read and write on the same file. This creates a local UNIX socket pair for inter-process communication. It returns two file descriptors for bidirectional communication.
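The bidirectional behavior can be illustrated with a small round trip inside a single process. This is only a sketch: the helper name `socketpair_roundtrip` is hypothetical, and it assumes a short, non-empty message that fits in the socket buffer.

```c
#define _POSIX_C_SOURCE 200809L /* for ssize_t and socketpair under strict C */
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write `msg` on one end of a socket pair and read it
 * back from the other end. Returns the number of bytes read, or -1. */
ssize_t socketpair_roundtrip(const char *msg, char *out, size_t out_size)
{
    size_t len = strlen(msg);
    if (len == 0 || out_size == 0) {
        return -1; /* an empty message would leave read() blocked */
    }

    int fd[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd) == -1) {
        return -1;
    }

    ssize_t n = -1;
    if (write(fd[0], msg, len) == (ssize_t)len) {
        /* Data written to fd[0] becomes readable on fd[1], and vice versa. */
        n = read(fd[1], out, out_size - 1);
        if (n >= 0) {
            out[n] = '\0';
        }
    }

    close(fd[0]);
    close(fd[1]);
    return n;
}
```

In the lab setting, the two descriptors would instead be split between the parent and the child after `fork()`, with each process closing the end it does not use.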
The program must handle some signal and print them: The program must handle several signals and print their names when received:
```c ```c
static void catch_signal(int signal) { static void catch_signal(int signal) {
@@ -58,11 +58,11 @@ static void install_catch_signal()
} }
``` ```
There was one thing to be anticipate. If the `ctrl+c` is handled, it has to exit the process. Because the process will block the terminal. The only way to kill the process is to open in another terminal a tool like `top` or `htop`. One important design consideration to anticipate was signal handling behavior. If `Ctrl+C` (SIGINT) is caught but the handler does not terminate the process, the application would continue to run and block the terminal. In that case, the only way to kill the process would be to open another terminal and use a tool like `top` or `htop`.
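A common way to handle this is to let the handler set a flag that the main loop checks, so that `Ctrl+C` still ends the program. The sketch below assumes this pattern; the names `stop_requested`, `install_sigint_flag`, and `should_stop` are hypothetical and do not come from the lab code.

```c
#define _POSIX_C_SOURCE 200809L /* for sigaction under strict C */
#include <signal.h>
#include <string.h>

/* Flag set from the signal handler; sig_atomic_t is safe to write there. */
static volatile sig_atomic_t stop_requested = 0;

static void on_sigint(int signo)
{
    (void)signo;
    stop_requested = 1;
}

/* Install the SIGINT handler. Returns 0 on success, -1 on error. */
int install_sigint_flag(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGINT, &sa, NULL);
}

/* The main loop polls this and exits cleanly once it returns non-zero. */
int should_stop(void)
{
    return stop_requested != 0;
}
```

A loop such as `while (!should_stop()) { ... }` then terminates after the first `Ctrl+C`, without needing a second terminal to kill the process.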
Finally, each processus has his own core. This setup with the `sched_setaffinity`: Finally, each process is pinned to its own CPU core. This is configured using `sched_setaffinity`:
```c ```c
/* Setup CPU for process */ /* Setup CPU affinity for process */
CPU_SET(child_cpu, &set); CPU_SET(child_cpu, &set);
int ret = sched_setaffinity(parent_pid, sizeof(set), &set); int ret = sched_setaffinity(parent_pid, sizeof(set), &set);
if (ret == -1) { if (ret == -1) {
@@ -71,12 +71,12 @@ if (ret == -1) {
} }
``` ```
This can be verified by executing the program and observed in the `htop` tool. This can be verified by executing the program and observing CPU usage in `htop`.
```bash ```bash
$ ./multiprocessing $ ./multiprocessing
Child processus: pid=273 Child process: pid=273
Parent processus: pid=274 Parent process: pid=274
Message 0: Hallo, hallo ! Message 0: Hallo, hallo !
Message 1: ça geht ! Message 1: ça geht !
Message 2: Comment vont les olives ? Message 2: Comment vont les olives ?
@@ -91,25 +91,25 @@ SIGINT received
``` ```
#figure( #figure(
image("control_cpu_process_ex_1.png"), image("control_cpu_process_ex_1.png"),
caption: [Execution of the program multiprocessus] caption: [Execution of the multiprocessing program]
)<multiprocessus> )<multiprocessus>
The @multiprocessus shows the #gls("pid", long: false) and the core of the processus and they can be compared to the output of the executable before. The @multiprocessus shows the #gls("pid", long: false) and the assigned CPU core for each process, which can be compared with the console output shown above.
The child processus has the #gls("pid", long: false) 273 and the core 0. The parent processus has the #gls("pid", long: false) 274 and the core 1. The child process has PID 273 and runs on core 0, whereas the parent process has PID 274 and runs on core 1.
== #gls("cgroups", long: false) memory == #gls("cgroups", long: false) memory
The goal of this part is to understand how to use #gls("cgroups", long: false) to limit the resources of a process. We will initially focus on memory, but #gls("cgroups", long: false) can also be used to limit #gls("cpu", long: false), #gls("io", long: false), and other ressources. The goal of this part is to understand how to use #gls("cgroups", long: false) to limit the resources of a process. We will initially focus on memory, but #gls("cgroups", long: false) can also be used to limit #gls("cpu", long: false), #gls("io", long: false), and other resources.
For limit the memory usage of a process, we cans use the `memory` subsystem of #gls("cgroups", long: false). We use #gls("cgroups", long: false) v1 with our Nanopi. To limit the memory usage of a process, we can use the `memory` subsystem of #gls("cgroups", long: false). On this NanoPi, we use #gls("cgroups", long: false) v1.
We must first mount a temporary filesystem for #gls("cgroups", long: false): We must first mount a temporary filesystem for #gls("cgroups", long: false):
```bash ```bash
|> mount -t tmpfs none /sys/fs/cgroup |> mount -t tmpfs none /sys/fs/cgroup
``` ```
We can the create a directory for the memory #gls("cgroups", long: false), mount the #gls("cgroups", long: false) filesystem with memory, and create a subdirectory for our #gls("cgroups", long: false): We can then create a directory for the memory subsystem, mount the corresponding #gls("cgroups", long: false) filesystem, and create a subdirectory for our specific group:
```bash ```bash
# Create a directory for the memory cgroup # Create a directory for the memory cgroup
@@ -122,7 +122,7 @@ We can the create a directory for the memory #gls("cgroups", long: false), mount
|> mkdir /sys/fs/cgroup/memory/0 |> mkdir /sys/fs/cgroup/memory/0
``` ```
We can then add the current process to this memory #gls("cgroups", long: false) and set a memory limit of 20 #gls("mib", long: false): We can then add the current process to this memory cgroup and set a memory limit of 20 #gls("mib", long: false):
```bash ```bash
# Add the current process to the memory cgroup # Add the current process to the memory cgroup
@@ -132,7 +132,7 @@ We can then add the current process to this memory #gls("cgroups", long: false)
|> echo 20M > /sys/fs/cgroup/memory/0/memory.limit_in_bytes |> echo 20M > /sys/fs/cgroup/memory/0/memory.limit_in_bytes
``` ```
We can then run our test program that allocates memory in a loop and see what happens when we exceed the memory limit. We can then run our test program that allocates memory in a loop to see what happens when we exceed the memory limit.
```c ```c
for (i = 0; i < NUM_BLOCKS; i++) { for (i = 0; i < NUM_BLOCKS; i++) {
@@ -148,7 +148,7 @@ for (i = 0; i < NUM_BLOCKS; i++) {
} }
``` ```
We can use the `cgroups.sh` script in `04-multiprocessing` to set up the #gls("cgroups", long: false) and run the test program, but we need to run with the actual context, so we need to execute the script with `.`: We can use the `cgroups.sh` script in `04-multiprocessing` to set up #gls("cgroups", long: false) and run the test program. However, to execute the script in the context of our current shell, we must source it using the `.` command:
```bash ```bash
|> just cgroups # Build the test program |> just cgroups # Build the test program
@@ -158,25 +158,25 @@ We can use the `cgroups.sh` script in `04-multiprocessing` to set up the #gls("c
=== What is the behavior of the command `echo $$ > ...` on #gls("cgroups", long: false)? === What is the behavior of the command `echo $$ > ...` on #gls("cgroups", long: false)?
The `$$` represent the current #gls("pid", long: false). When we execute the command `echo $$ > /sys/fs/cgroup/memory/0/tasks`, we are writing the #gls("pid", long: false) of the current process into the `tasks` file of the specified #gls("cgroups", long: false). This action effectively assigns the process to that #gls("cgroups", long: false), meaning that it will now be subject to the resource limits and policies defined for that #gls("cgroups", long: false). The `$$` shell variable represents the #gls("pid", long: false) of the current shell. When we execute the command `echo $$ > /sys/fs/cgroup/memory/0/tasks`, we write the PID of the current shell process into the `tasks` file of the specified cgroup. This action assigns the process to that control group, meaning that any program run from this shell will inherit the resource limits and policies defined for that cgroup.
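The same assignment can be done from a C program instead of the shell. The sketch below is an illustration under assumptions: the helper name `write_pid_to_file` is hypothetical, and the target path is passed in by the caller.

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: write `pid` as decimal text into the file at `path`.
 * Returns 0 on success, -1 on error. */
int write_pid_to_file(const char *path, pid_t pid)
{
    FILE *f = fopen(path, "w");
    if (f == NULL) {
        return -1;
    }
    int ok = fprintf(f, "%ld\n", (long)pid) > 0;
    /* Write errors may only surface when the buffer is flushed on close. */
    if (fclose(f) != 0) {
        ok = 0;
    }
    return ok ? 0 : -1;
}
```

Calling `write_pid_to_file("/sys/fs/cgroup/memory/0/tasks", getpid())` would then have the same effect as the `echo $$` command for the calling process, assuming the cgroup has been created as shown earlier and the program has permission to write to it.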
=== What is the behavior of the memory subsystem when the memory quota is exhausted? Can we modify it? If yes, how? === What is the behavior of the memory subsystem when the memory quota is exhausted? Can we modify it? If yes, how?
For this nanopi, we use #gls("cgroups", long: false) v1, so the relevant file is `memory.limit_in_bytes`. When a process within a #gls("cgroups", long: false) exceeds the memory limit defined by `memory.limit_in_bytes`, the Linux kernel will attempt to reclaim memory. If it cannot reclaim enough memory, it will invoke the #gls("oom", long: false) killer to terminate processes within that #gls("cgroups", long: false) to free up memory. On this NanoPi, we use #gls("cgroups", long: false) v1, so the resource configuration is done via the `memory.limit_in_bytes` file. When a process within a cgroup exceeds the memory limit defined by this file, the Linux kernel will attempt to reclaim memory. If it cannot reclaim sufficient memory, it will invoke the #gls("oom", long: false) killer to terminate processes within that cgroup to free up memory.
It's possible to modify this behavior in several ways: It is possible to modify this behavior in several ways:
+ Use "Soft Limits" (Specific to #gls("cgroups", long: false) v1) + *Use "Soft Limits" (specific to #gls("cgroups", long: false) v1):*
In addition to a hard limit (`memory.limit_in_bytes`), you can set a soft limit (`memory.soft_limit_in_bytes`). In addition to a hard limit (`memory.limit_in_bytes`), a soft limit can be set via `memory.soft_limit_in_bytes`.
*Behavior:* The kernel will not kill the process if the soft limit is exceeded, unless the entire system is low on global memory. If global memory is low, the kernel will start reclaiming memory from groups that exceed their soft limit. *Behavior:* The kernel does not kill the process when the soft limit is exceeded, unless the entire system runs low on memory. If global memory is low, the kernel begins reclaiming memory from cgroups that exceed their soft limits.
+ Adjust the #gls("oom", long: false) Killer Priority Score + *Adjust the #gls("oom", long: false) Killer priority score:*
If we specify an #gls("oom", long: false) score adjustement for the process. By modifying the file `/proc/[PID]/oom_score_adj` with the value `-1000`, we can make the process almost "immune" to the #gls("oom", long: false) Killer. We can specify an #gls("oom", long: false) score adjustment for the process. By modifying the `/proc/[PID]/oom_score_adj` file to the value `-1000`, the process becomes virtually immune to the #gls("oom", long: false) killer.
=== How to watch the memory usage? === How to watch the memory usage?
We can monitor the memory usage of a #gls("cgroups", long: false) by reading it directly from the file in the specific #gls("cgroups", long: false): We can monitor the memory usage of a control group by reading directly from its configuration files:
```bash ```bash
# Current memory usage in bytes # Current memory usage in bytes
@@ -189,8 +189,8 @@ We can monitor the memory usage of a #gls("cgroups", long: false) by reading it
``` ```
== #gls("cgroups", long: false) CPU == #gls("cgroups", long: false) CPU
To check this part, we need a tiny program that consumes #gls("cpu", long: false) with at least two process. To check this part, we need a tiny program that consumes #gls("cpu", long: false) with at least two processes.
The following program creates a child process that performs #gls("cpu", long: false) intensive work, while the parent process also performs #gls("cpu", long: false) intensive work. We can then use #gls("cgroups", long: false) to limit the #gls("cpu", long: false) usage of one of the processes and observe the effect. The following program creates a child process that performs #gls("cpu", long: false)-intensive work, while the parent process also performs #gls("cpu", long: false)-intensive work. We can then use #gls("cgroups", long: false) to limit the #gls("cpu", long: false) usage of one of the processes and observe the effect.
```c ```c
int main() { int main() {
pid_t pid = fork(); pid_t pid = fork();
@@ -206,12 +206,12 @@ int main() {
} }
``` ```
Based on previous exercice, we should already have mounted the #gls("cgroups", long: false) filesystem. Based on the previous exercise, we should already have mounted the #gls("cgroups", long: false) filesystem.
```bash ```bash
|> mount -t tmpfs none /sys/fs/cgroup |> mount -t tmpfs none /sys/fs/cgroup
``` ```
We can then create and mount the #gls("cgroups", long: false) filesystem for the `cpuset` subsystem We can then create and mount the #gls("cgroups", long: false) filesystem for the `cpuset` subsystem:
```bash ```bash
# Create a directory for the cpuset cgroup # Create a directory for the cpuset cgroup
|> mkdir /sys/fs/cgroup/cpuset |> mkdir /sys/fs/cgroup/cpuset
@@ -220,54 +220,54 @@ We can then create and mount the #gls("cgroups", long: false) filesystem for the
|> mount -t cgroup -o cpu,cpuset cpuset /sys/fs/cgroup/cpuset |> mount -t cgroup -o cpu,cpuset cpuset /sys/fs/cgroup/cpuset
``` ```
Now we had the prerequirements, we can create 2 groupes. One for each of our running programme. With the following command, we attribute one or more #gls("cpu", long: false) to each group (`cpuset.cpus`). I'm not sure about the `cpuset.mems` file, but it seems to be related to memory nodes. It's definetly a topic that should be explored more in depth, but for now, we set to `0` as specified in the lab instructions. With these prerequisites met, we can create two groups, one for each instance of our running program. Using the commands below, we assign one or more #gls("cpu", long: false) cores to each group via `cpuset.cpus`. I'm not sure about the `cpuset.mems` file, but it seems to be related to memory nodes. It is definitely a topic that should be explored in more depth, but for now, we set it to `0` as specified in the lab instructions:
```bash ```bash
# Create and allocate CPU for programme "low" # Create and allocate CPU for program "low"
|> mkdir /sys/fs/cgroup/cpuset/low |> mkdir /sys/fs/cgroup/cpuset/low
|> echo 1 > /sys/fs/cgroup/cpuset/low/cpuset.cpus |> echo 1 > /sys/fs/cgroup/cpuset/low/cpuset.cpus
|> echo 0 > /sys/fs/cgroup/cpuset/low/cpuset.mems |> echo 0 > /sys/fs/cgroup/cpuset/low/cpuset.mems
# Create and allocate CPU for programme "high" # Create and allocate CPU for program "high"
|> mkdir /sys/fs/cgroup/cpuset/high |> mkdir /sys/fs/cgroup/cpuset/high
|> echo 2,3 > /sys/fs/cgroup/cpuset/high/cpuset.cpus |> echo 2,3 > /sys/fs/cgroup/cpuset/high/cpuset.cpus
|> echo 0 > /sys/fs/cgroup/cpuset/high/cpuset.mems |> echo 0 > /sys/fs/cgroup/cpuset/high/cpuset.mems
``` ```
We can then open 2 shells and run the test program in each of them, while adding the programme to the corresponding #gls("cgroups", long: false): We can then open two shells and run the test program in each of them, while adding each program to its corresponding control group:
```bash ```bash
# In the first shell, add it on the "low" cgroup and run the test program # In the first shell, add it to the "low" cgroup and run the test program
|> . ./max-cpu.sh low |> . ./max-cpu.sh low
# In the second shell, add it on the "high" cgroup and run the test program # In the second shell, add it to the "high" cgroup and run the test program
|> . ./max-cpu.sh high |> . ./max-cpu.sh high
``` ```
We see on @max-cpu that as expected, both process in program _low_ is limited to #gls("cpu", long: false) 1, while the programm _high_ is using #gls("cpu", long: false) 2 and 3, one for each process. As shown in @max-cpu, as expected, both processes in the "low" program are limited to #gls("cpu", long: false) core 1, while the "high" program uses #gls("cpu", long: false) cores 2 and 3 (one for each process).
#figure( #figure(
image("max-cpu.png"), image("max-cpu.png"),
caption: [CPU usage of the two programmes with dedicated resources] caption: [CPU usage of the two programs with dedicated resources]
)<max-cpu> )<max-cpu>
To share resources at 75% and 25%, we can use the `cpu.shares` file in the `cpu` cgroup. We attribute a value 3 time high for the _high_ group than for the _low_ group. To share resources at 75% and 25%, we can use the `cpu.shares` file in the `cpu` cgroup. We assign a share value to the "high" group that is three times higher than that of the "low" group:
```bash ```bash
|> echo 75 > /sys/fs/cgroup/cpu/high/cpu.shares |> echo 75 > /sys/fs/cgroup/cpu/high/cpu.shares
|> echo 25 > /sys/fs/cgroup/cpu/low/cpu.shares |> echo 25 > /sys/fs/cgroup/cpu/low/cpu.shares
``` ```
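The values in `cpu.shares` are relative weights, so the expected split is each group's share divided by the sum of all shares. The small sketch below works through that arithmetic; the function name `cpu_share_percent` is hypothetical, and the result only describes the case where all groups compete for the same CPUs at the same time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: expected CPU percentage of group `index` when every
 * group in `shares` is busy on the same CPUs. */
unsigned cpu_share_percent(const unsigned *shares, size_t count, size_t index)
{
    assert(index < count);

    unsigned long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += shares[i];
    }
    assert(total > 0);

    /* Integer division: the result is rounded down to a whole percent. */
    return (unsigned)((shares[index] * 100UL) / total);
}
```

With the values used here, `{75, 25}`, the first group gets 75 * 100 / (75 + 25) = 75 percent and the second gets 25 percent.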
Then running the test program in each shell, we see on @shared-cpu that the _high_ process is limited to 75% of the #gls("cpu", long: false), while the _low_ process is limited to 25%. After running the test program in each shell, we can observe in @shared-cpu that the processes in the "high" cgroup are allocated 75% of the CPU capacity, while those in the "low" cgroup receive 25%:
```bash ```bash
# In the first shell, add it on the "low" cgroup and run the test program # In the first shell, add it to the "low" cgroup and run the test program
|> . ./shared-cpu.sh low |> . ./shared-cpu.sh low
# In the second shell, add it on the "high" cgroup and run the test program # In the second shell, add it to the "high" cgroup and run the test program
|> . ./shared-cpu.sh high |> . ./shared-cpu.sh high
``` ```
#figure( #figure(
image("shared-cpu.png"), image("shared-cpu.png"),
caption: [CPU usage of the two programmes with shared resources] caption: [CPU usage of the two programs with shared resources]
)<shared-cpu> )<shared-cpu>

@@ -1,6 +1,6 @@
#import "/doc/metadata.typ": * #import "/doc/metadata.typ": *
= Optimization = Linux System Optimisation
In this laboratory, we experiment with the `perf` tool. In this laboratory, we experiment with the `perf` tool.

@@ -79,9 +79,9 @@ int main(int argc, char* argv[]) {
/* Fork a child process */ /* Fork a child process */
pid_t pid = fork(); pid_t pid = fork();
if (pid == 0) { /* Parent processus */ if (pid == 0) { /* Parent process */
pid_t parent_pid = getpid(); pid_t parent_pid = getpid();
printf("Parent processus: pid=%d\n", parent_pid); printf("Parent process: pid=%d\n", parent_pid);
/* Setup CPU for process */ /* Setup CPU for process */
CPU_SET(child_cpu, &set); CPU_SET(child_cpu, &set);
@@ -99,11 +99,11 @@ int main(int argc, char* argv[]) {
memset(buffer, 0, sizeof(buffer)); memset(buffer, 0, sizeof(buffer));
} }
} else if (pid > 0) { /* Child processus */ } else if (pid > 0) { /* Child process */
pid_t child_pid = getpid(); pid_t child_pid = getpid();
printf("Child processus: pid=%d\n", child_pid); printf("Child process: pid=%d\n", child_pid);
/* Setup CPU affinity for processus */ /* Setup CPU affinity for process */
CPU_SET(parent_cpu, &set); CPU_SET(parent_cpu, &set);
int ret = sched_setaffinity(child_pid, sizeof(set), &set); int ret = sched_setaffinity(child_pid, sizeof(set), &set);
if (ret == -1) { if (ret == -1) {
@@ -111,7 +111,7 @@ int main(int argc, char* argv[]) {
exit(EXIT_FAILURE); exit(EXIT_FAILURE);
} }
/* Write messages for the parent processus */ /* Write messages for the parent process */
for (int i = 0; i < NBR_MSG; i++) { for (int i = 0; i < NBR_MSG; i++) {
write(fd[0], MSG[i], strlen(MSG[i])); write(fd[0], MSG[i], strlen(MSG[i]));
} }