Tutorial easyaspi314's C Tutorial

#1
Before you start hacking around with the disassemblies, you need to know how to program in C.

C is a (fairly) simple language, but it can be confusing at first. If you know C++, JavaScript, PHP, or Java, a lot of this will seem familiar.

I will help get you started.

There are many different tutorials, so if you don't understand, try looking at some other tutorials.

But, please, let me know what you think and if this tutorial was helpful.

Lesson 0: Setting Up
So, because Pokeruby and the GBA are very inefficient and inflexible for programming, we are going to be starting off compiling native code to run on your computer.

In order to do this, you need either (preferably) Clang (part of LLVM) or GCC. Technically, you can use any C compiler, but trying to juggle the command line options is confusing.

If you are using Clang, substitute clang for gcc in the commands below.

On Windows, you should install and use MinGW via MSYS2.
On Linux, you probably have GCC installed already. If not, install it from your package manager.
On macOS, install iTerm2 (the default terminal is inefficient) and run xcode-select --install. If it doesn't say that the tools are already installed, follow the instructions.

You should be able to open a terminal and type gcc --version and see something similar to this:
Code:
gcc-7 (Homebrew GCC 7.3.0_1) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Or this (may also say clang instead of Apple LLVM):
Code:
Apple LLVM version 8.0.0 (clang-800.0.42.1)
Target: x86_64-apple-darwin15.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
You also want a decent text editor with syntax highlighting. Here are some suggestions:

Note that if you prefer watching a video, I have made an unscripted 30 minute video with similar information as well as a few later lessons. It is not a direct transcript of this post, so it is still worth watching. You can watch it here:

Lesson 1: Hello, world!
First, you want to make a directory for these examples. I suggest something you can find without any spaces, such as ~/c-tutorials on Mac/Linux or C:\Users\username\Documents\c-tutorials\ on Windows.

Now, open a terminal and navigate to that folder.
Do this by typing cd, space, then the file path.
On Windows (in the MSYS2 shell), you need to change the path from, e.g., C:\Users\username\Documents\c-tutorials to /c/Users/username/Documents/c-tutorials

Ok, now create a new text file in that folder called "hello.c" with the text editor of your choice.

Paste the following lines into it and save:
C:
/*
* Tutorial 1: Hello, world!
* Filename: hello.c
* Run as:
* gcc -Wall hello.c -o hello && ./hello
*/

/**
* Include the system header stdio.h
* for the printf function below.
*/
#include <stdio.h>

/**
* Declare and implement a function called main,
* which takes no arguments and returns int.
*
* main is the function that is called at the very start
* of the program.
*/
int main(void)
// brace opens the main function block
{
    // Print "Hello, world!" plus a new line to the terminal
    printf("Hello, world!\n");
    // Set the return value of main to 0, meaning the program ended OK
    return 0;
// Close the brace above
} // end function main

Now, go back to your terminal, and type this:
gcc -Wall hello.c -o hello && ./hello

If all goes well, you should see this:
Code:
Hello, world!
Ok, now let's talk about what happened.

First of all, let's talk about comments. Comments are notes that you can put in a file that are ignored by the compiler. There are two types of comments in C, block comments and line comments.

Line comments start with // and end at the end of the line.
Block comments start with /* and end with */. Anything between these two marks is a comment, whether it is in the middle of the line, the end of the line, or across multiple lines. Note that when you are trying to make a paragraph to document something, it is a good idea to format it like this:
C:
/**
* This is a comment.
* It spans multiple lines.
* It documents the line below.
* Note the two stars at the top.
*/
codeLineToComment
Another thing to know is that, unlike, say, Python, formatting (spacing, indentation, and line breaks) doesn't matter in C, with the exception of any lines that start with #. The only thing that matters is that the words are intact. But, please, keep it consistent.

Anyways, so after that comment is this:
C:
#include <stdio.h>
This includes the declarations of all of the functions in the system header stdio.h. We specifically need the printf function. More on that later.

C:
int main(void)
This is a function declaration. Reading the function signature from left to right: int is the return type, main is the function name, and (void) is the argument list, which here means "no arguments".

A function is a set of instructions that can be reused. Think of it like a recipe. It can take in values and spit out up to one value.

main is a special function. When you start a program, it calls the function main, everything happens from there, and the program eventually ends (unless there is a problem) in main.

Note: In C, all functions need a unique name. Unlike C++ or Java, C doesn't have overloading, which is when you have two functions with the same name and different arguments.

C:
{
...
}
This is a block of code. Because it comes directly after the signature of main, it becomes the function implementation. It is marked by curly braces, or {}. Every opening brace needs a closing brace, and what happens in the braces stays in the braces.

Let's go into the block.
C:
printf("Hello, world!\n");
Here is a function call. It is one type of statement. It tells the compiler to go to that function and run the function printf with the argument being a string containing Hello, world! followed by a newline (\n). Strings are in double quotes. A function call is similar to a function declaration, but instead of type names like int and void, you use actual values.

printf is a function that prints a string, plus any extra arguments you give it, to the console. It has so many features that I won't cover them for now. So, what we do is print "Hello, world!" to the console, and put a newline at the end for formatting. The newline prevents this:
Code:
bash ~/c-tutorials $ ./hello
Hello, world!bash ~/c-tutorials $

All statements in C end in a semicolon. You can't leave it out like you can in JavaScript. Think about it like a period at the end of a sentence.
For example, if we leave out the semicolons (bad code below)…
C:
#include <stdio.h>

int main(void) {
    printf("Hello, world!\n") // no semicolon
    return 0 // no semicolon
}
If you use GCC, it isn't the greatest at showing errors and will say something like this:
Code:
hello.c: In function 'main':
hello.c:5:5: error: expected ';' before 'return'
     return 0
     ^~~~~~
but Clang is a bit more helpful:
Code:
hello.c:4:30: error: expected ';' after expression
    printf("Hello, world!\n")
                             ^
                             ;
hello.c:5:13: error: expected ';' after return statement
    return 0
            ^
            ;
2 errors generated.
This is why I recommend Clang if possible, as it just has better errors.

Anyways, back on topic, the next line:
C:
return 0;
return will jump to the end of a function. If the return type of the function is not void, you have to put a value after it and you can't leave it out (the compiler will complain with the message "control reaches end of non-void function"; memorize it, it is a terrible error message). If it is void, you must not return a value, and you can leave the return statement out entirely. If you return early from a function, make sure you wrap it in a conditional, or the code after it will never run (and some compilers will complain about unreachable code).
To return or not to return? That is the rather complicated question.
C:
int returningInt(void) {
    return 2; // okay, returns an int
}
void notReturningAnything(void) {
    return; // okay, but not required because it is at the end of the void function
}
void dontNeedToReturn(void) {
    // okay, the end of a void function doesn't need a return statement
}
int shouldReturn(void) {
    // not okay, this needs a return statement because it isn't void.
    // the compiler warns: "control reaches end of non-void function".
}
void shouldNotBeReturningAnything(void) {
    return 3; // not okay, this return statement returns int, but the void function should not return a value
}
int shouldReturnInt(void) {
    return; // not okay, this return statement expects a value because the function returns int
}
int shouldAlsoReturnInt(void) {
    return "totally an int and not a string"; // not okay, this is a string, not an int!
}
void shouldntBeLeavingEarly(void) {
    return; // not okay, the stuff after this is not used
    doSomething();
}
void couldBeLeavingEarly(void) {
    if (condition) { // more on these later!
        return; // okay because this only might happen
    }
    doSomething();
}
int couldNotReturnAValue(void) {
    if (condition) {
        return 4;
    }
    // not okay, if the condition is false, we reach the end of the function and do not return a value.
    // the compiler warns: "control may reach end of non-void function".
}
int elseReturn(void) {
    if (condition) {
        return 4;
    } else {
        return 2;
    }
    // okay, because the else catches all possible conditions.
}
main, being a special function, lets you set the exit code. If you programmed in Java, it is the same as calling System.exit(x). Whatever number you put after a return statement in main is the exit code.

To check this, in bash, run ./hello; echo $? (note the semicolon, not the double ampersand), mess with the return 0 and watch the number change.

This all ends in that closing brace. Congratulations, you have created a "Hello, world!" program and completely beaten the concept to death (don't worry, the later lessons are much shorter)!

Lesson 2: Break it up!
I am frustrated because my computer froze up, I needed to restart, and I lost all of Lesson 2.:cry:

Ahem.
Now, you can put all your code in main. But your code will become so disorganized and ugly that it isn't even funny.

So, we break things up into separate functions.

Copy hello.c to hello2.c and let's make a new function, called printHelloWorld.
Note that I removed the extra comments that you would never see outside of a tutorial.

C:
/*
* Tutorial 2: Break it up!
* Filename: hello2.c
* Run as:
* gcc -Wall hello2.c -o hello2 && ./hello2
*/

#include <stdio.h> // for printf

/**
* main is the function that is called at the very start
* of the program.
*/
int main(void)
{
    printHelloWorld(); // Note that we just use empty parentheses despite the void
    return 0;
} // end function main

/**
* Prints "Hello, world!" to the console.
*/
void printHelloWorld(void)
{
    printf("Hello, world!\n");
    // Because this is a void function, we don't need a return value.
} // end function printHelloWorld
The same thing happens, but it is broken up into a separate function.

Cool, huh?

Well, I lied. Compile that with GCC and you get this:

Code:
hello2.c: In function 'main':
hello2.c:16:5: warning: implicit declaration of function 'printHelloWorld' [-Wimplicit-function-declaration]
     printHelloWorld(); // Note that we just use empty parentheses despite the void
     ^~~~~~~~~~~~~~~
hello2.c: At top level:
hello2.c:23:6: warning: conflicting types for 'printHelloWorld'
void printHelloWorld(void)
      ^~~~~~~~~~~~~~~
hello2.c:16:5: note: previous implicit declaration of 'printHelloWorld' was here
     printHelloWorld(); // Note that we just use empty parentheses despite the void
     ^~~~~~~~~~~~~~~
agbcc will say this:
Code:
hello2.i:357: warning: type mismatch with previous implicit declaration
hello2.i:349: warning: previous implicit declaration of `printHelloWorld'
hello2.i:357: warning: `printHelloWorld' was previously implicitly declared to return `int'
The compiler will complain.

Why is this? I'll tell you, but first, let's talk about the compiler.

The C compiler is quirky. First of all, it reads from top to bottom.

So, unlike Java, you can't use a function that isn't declared first without a complaint (GCC) or an error (Clang). At the point where we call printHelloWorld() in main, the compiler doesn't know what printHelloWorld is. We are trying to make an implicit declaration, which is bad.

So, there are two ways to fix this.

The first is to reorder how we declare the function so that we put it above main, like this:
C:
/*
* Tutorial 2: Break it up!
* Filename: hello2.c
* Run as:
* gcc -Wall hello2.c -o hello2 && ./hello2
*/

#include <stdio.h> // for printf

/**
* Prints "Hello, world!" to the console.
*/
void printHelloWorld(void)
{
    printf("Hello, world!\n");
    // Because this is a void function, we don't need a return value.
} // end function printHelloWorld

/**
* main is the function that is called at the very start
* of the program.
*/
int main(void)
{
    printHelloWorld(); // Note that we just use empty parentheses despite the void
    return 0;
} // end function main
Compile and run, and it works fine.

But, things can get complicated when you do that.

The other way of doing this is a forward declaration.

Forward declarations look like this:
C:
void printHelloWorld(void);
So, let's take our example again and use a forward declaration:
C:
/*
* Tutorial 2a: Break it up!
* Filename: hello2.c
* Run as:
* gcc -Wall hello2.c -o hello2 && ./hello2
*/

#include <stdio.h> // for printf

// Forward declare printHelloWorld
void printHelloWorld(void);

/**
* main is the function that is called at the very start
* of the program.
*/
int main(void)
{
    printHelloWorld(); // Note that we just use empty parentheses despite the void
    return 0;
} // end function main

/**
* Prints "Hello, world!" to the console.
*/
void printHelloWorld(void)
{
    printf("Hello, world!\n");
    // Because this is a void function, we don't need a return value.
} // end function printHelloWorld
But, we can break this up even more. Because even this can get messy.

Back to the compiler.

C is a compiled language. Unlike, say, a Bash or Python script (an interpreted language) which is interpreted by another program, C code needs to be compiled to a program.

C was designed to be portable. Cross platform, cross processor, fast, simple, yet still powerful. In order to do that, it needs to do a few steps:

  1. Preprocessing (cpp/gcc -E): The compiler strips away comments, expands macros, isolates specific code, and replaces include statements with the content of the file. This creates a preprocessed C file (.i).
  2. Compiling (cc1/gcc -S): The compiler takes the C code, does its optimizations and translates it to native assembly. While it isn't important, Clang does two steps here. This creates an assembly file (.s). agbcc is a cc1 compiler, so it needs preprocessed code.
  3. Assembling (as/gcc -c): The compiler takes assembly files and converts them to binary. This creates an object file (.o).
  4. Linking (ld): The compiler takes all of the .o files and mashes them into a single file. This either makes an executable that you can run (.exe on Windows, no extension or .out on Mac/Linux) or a library (.dll on Windows, .so on Linux, .dylib on Mac) that you can use in other programs.

We're going to look at the preprocessor.

Preprocessor statements start with a # and are the only things in C that have to be on one line and only one line.

The preprocessor statement we are looking at is #include, called an include statement.

Include statements are very basic. They simply take a file and dump its contents in the file that includes it.

We almost always use this to include a header file, which contains a bunch of forward declarations, but you can technically use it for any file.

There are two variations of them: Angled brackets (< and >), and quotes (I think you know what quotes are).

Angled brackets are to include a system or library header. In this case, I am including stdio.h from the C Standard Library, also known as libc.

These headers are found in /usr/include on Linux, *gasp* /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/include *gasp* on Mac (/usr/include has unused GCC headers), or on Windows, I believe it is in the include folder in your MSYS2 installation. You can add more paths to this with the -I option.

The include statement we were using was #include <stdio.h>.

The other version, quotes, includes a file relative to the directory of the file doing the including, or, alternatively, relative to a path that you specify with -iquote. We use these for our own headers.

If we run the preprocessor on hello2.c like this: gcc -E -P hello2.c, your console will be bombarded by a bunch of code, followed by the code in hello2.c with the comments stripped out. That bunch of code is exactly what I said it would be: the preprocessed contents of stdio.h from that include directory. The -E tells GCC to only run the preprocessor, and the -P tells GCC's preprocessor not to add a bunch of debug info. That info is normally useful, but not when we are trying to read the output.

But, for now, let's work on making our own header.

Make a new file, called hello2.h, in the same folder as hello2.c. Header files in C always end in .h.

Cut and paste that forward declaration from hello2.c and put it in hello2.h.
C:
// Forward declare printHelloWorld
void printHelloWorld(void);
And replace those lines in hello2.c with this:
C:
#include "hello2.h"
And comment out (put two slashes before) the
C:
#include <stdio.h>
temporarily so we can better see what is happening.

Run gcc -E -P hello2.c again and tada! (note that Clang likes to put a bunch of blank lines in the output):
C:
void printHelloWorld(void);
int main(void)
{
    printHelloWorld();
    return 0;
}
void printHelloWorld(void)
{
    printf("Hello, world!\n");
}
Uncomment (remove the two slashes before) the line I told you to comment out and run GCC normally, gcc -Wall hello2.c -o hello2 && ./hello2. Done.

However. I lied again. There is a terrible mistake waiting to happen.

Make three new files, hello_goodbye.c, hello.h, and goodbye.h with the following contents, respectively:
C:
/*
* Tutorial 2a: Hello, Goodbye!
* Filename: hello_goodbye.c
* Run as:
* gcc -Wall hello_goodbye.c -o hello_goodbye && ./hello_goodbye
*/
#include <stdio.h> // for printf
#include "hello.h" // for printHelloWorld
#include "goodbye.h" // for printGoodbyeWorld

int main(void)
{
    printHelloWorld();
    printGoodbyeWorld();
    return 0;
}

/**
* Prints "Hello, world!" to the console
*/
void printHelloWorld(void)
{
    printf("Hello, world!\n");
}

/**
* Prints "Goodbye, world!" to console.
*
* This s*** got dark.
*/
void printGoodbyeWorld(void)
{
    printf("Goodbye, world!\n");
}
C:
/* Filename: hello.h */
#include "goodbye.h"

void printHelloWorld(void);
C:
/* Filename: goodbye.h */
#include "hello.h"

void printGoodbyeWorld(void);
Code:
~/c-tutorials $ gcc -Wall hello_goodbye.c -o hello_goodbye && ./hello_goodbye
In file included from goodbye.h:2:0,
                 from hello.h:2,
                 from goodbye.h:2,
                 from hello.h:2,
                 from goodbye.h:2,
                 from hello.h:2,
                 from goodbye.h:2,
                 from hello.h:2,
                 from goodbye.h:2,
                 from hello.h:2,
                 ... snip, this goes on for a while
                 from hello_goodbye.c:9:
hello.h:2:21: error: #include nested too deeply
#include "goodbye.h"
                     ^
In file included from hello.h:2:0,
                 from goodbye.h:2,
                 from hello.h:2,
                 from goodbye.h:2,
                 from hello.h:2,
                 from goodbye.h:2,
                 from hello.h:2,
                 from goodbye.h:2,
                 from hello.h:2,
                 from goodbye.h:2,
                 from hello.h:2,
                 from goodbye.h:2,
                 from hello.h:2,
                 ... snip
                 from goodbye.h:2,
                 from hello.h:2,
                 from hello_goodbye.c:10:
goodbye.h:2:19: error: #include nested too deeply
#include "hello.h"
                   ^
AAAAAAHHHHHH!!!!!! PANIC! PANIC! PANIC!


We got ourselves a recursive include. hello.h and goodbye.h both included each other. The C preprocessor didn't know what to do!

Now, we can be supercautious about what files we include, or we can make an include guard.

Just like they told you in health class, always use protection!

To make an include guard, we wrap our header files somewhat like what's below (I personally use the naming scheme below, but I think pokeruby uses GUARD_FILE_NAME). Make sure the name is unique, though. You can't just call them all GUARD or whatever.

This tells the preprocessor to only include this file once.
C:
/* Filename: hello.h */

/**
* If the guard for this header, in this case, HELLO_H, is not defined (set),
* include the contents below as it hasn't been included already, until
* the #endif marking the end of this include-once header.
*
* Otherwise, skip it.
*/
#ifndef HELLO_H
/**
* Declare that the file was included by setting a preprocessor flag named HELLO_H,
* to serve as the guard for this header.
*/
#define HELLO_H

#include "goodbye.h"

void printHelloWorld(void);

/**
* Mark that this is the end of the HELLO_H preprocessor statement,
* and as a result of this, the end of the code to include once.
*/
#endif // HELLO_H
For goodbye.h, I am not going to comment it, as nobody comments include guards.
C:
/* Filename: goodbye.h */

#ifndef GOODBYE_H
#define GOODBYE_H

#include "hello.h"

void printGoodbyeWorld(void);
#endif // GOODBYE_H
Try compiling it again, and it's all good!

For good measure, do the same thing for hello2.h.

Later parts coming soon!
 

Lunos

#4
C:
int main(void)
This is a function declaration. You can identify the function signature like so:
View attachment 193
Broken attachment.
That aside, I'll be looking forward to these tutorials. I don't think that I'll be able to do anything with them, but at least I'd like to give it a try.
EDIT: Btw, I just noticed that after executing gcc -Wall hello.c -o hello && ./hello, a "Hello.exe" file was created in the same directory where "Hello.c" is. I suppose you'll explain that effect at some point, right? I'm curious.
 
#5
Broken attachment.
Damn it. 😂
I'll fix it when I get a chance.
That aside, I'll be looking forward to these tutorials. I don't think that I'll be able to do anything with them, but at least I'd like to give it a try.
And that is OK. C isn't for everyone.
EDIT: Btw, I just noticed that after executing gcc -Wall hello.c -o hello && ./hello, a "Hello.exe" file was created in the same directory where "Hello.c" is. I suppose you'll explain that effect at some point, right? I'm curious.
Yes, that is a compiled executable. Unlike, say, a Bash or Python script which is interpreted by another program, C code needs to be compiled directly to a program.

C was designed to be portable. Cross platform, cross processor, fast, simple, yet still powerful. In order to do that, it needs to do a few steps:

  1. Preprocessing (cpp/gcc -E): The compiler strips away comments, expands macros, isolates specific code, and replaces include statements with the content of the file. This creates a preprocessed C file (.i).
  2. Compiling (cc1/gcc -S): The compiler takes the C code, does its optimizations and translates it to native assembly. While it isn't important, Clang does two steps here. This creates an assembly file (.s).
  3. Assembling (as/gcc -c): The compiler takes assembly files and converts them to binary. This creates an object file (.o).
  4. Linking (ld): The compiler takes all of the .o files and mashes them into a single file. This either makes an executable that you can run (.exe on Windows, no extension or .out on Mac/Linux) or a library (.dll on Windows, .so on Linux, .dylib on Mac) that you can use in other programs.
 
#7
I made a poll. Please let me know if the tutorial was helpful, as without your feedback, I don't know if I am actually helpful or if I am talking nonsense.

By the way, @Lunos, you said you were having trouble before, have things cleared up or do you need more help?