Tag: binary

  • New C test program for chastelib

    In my last post, I showed the test program for the Rust version of chastelib. I decided it would make sense to write a similar test program for the original C version of the library, which is the one I converted to Rust. Personally, I like this version better. In my opinion, global variables make the code cleaner, because otherwise I would have to decide which order the arguments to each function should go in. This way, the radix and width of the integer string are set through global data before calling the putint function.

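    Also notice that the b variable is set from a string using strint. Because radix is 16 at that point, the string "100" is read as hexadecimal and b becomes 256, which is why the loop prints 0 through 255. This is only to demonstrate proper use of the function. Normally it would not be used unless parsing string input from a user, such as a command line argument, as was done in chastehex.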
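
    The way a digit string turns into an integer can be sketched in a few lines. This is a simplified stand-in and not the library's strint: the function name parse_digits is made up for this example, it takes the radix as an argument instead of reading the global, and it assumes the string holds only valid digits for that radix.

    ```c
    #include <stdio.h>

    /* Hypothetical helper, not part of chastelib. It accumulates digits the
       same general way strint does, but does no whitespace skipping or error
       reporting. Assumes s contains only valid digits for the given radix. */
    int parse_digits(const char *s, int radix)
    {
     int value=0;
     while(*s!=0)
     {
      int digit;
      if(*s>='0' && *s<='9'){digit=*s-'0';}
      else if(*s>='A' && *s<='Z'){digit=*s-'A'+10;}
      else{digit=*s-'a'+10;}
      value=value*radix+digit; /* move earlier digits up one place, then add */
      s++;
     }
     return value;
    }

    int main(void)
    {
     printf("%d\n",parse_digits("100",16)); /* 1*16*16 = 256 */
     printf("%d\n",parse_digits("100",2));  /* 1*2*2 = 4 */
     printf("%d\n",parse_digits("FF",16));  /* 15*16+15 = 255 */
     return 0;
    }
    ```

    The same string gives a different integer for each radix, which is why the radix has to be set before the string is parsed.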

    In any case, this library controls all the integer and string conversion so that I can use it in larger projects. Because it is written in ANSI C, it should be portable to practically any modern machine that has a C compiler.

    main.c

    #include <stdio.h>
    #include <stdlib.h>
    #include "chastelib.h"
    
    int main(int argc, char *argv[])
    {
     int a=0,b;
    
     radix=16;
     int_width=1;
    
     putstring("This is the official test program for the C version of chastelib.\n");
     b=strint("100");
    
     putstring("Hello World!\n");
     
     while(a<b)
     {
      radix=2;
      int_width=8;
      putint(a);
      putstring(" ");
      radix=16;
      int_width=2;
      putint(a);
      putstring(" ");
      radix=10;
      int_width=3;
      putint(a);
    
      if(a>=0x20 && a<=0x7E)
      {
       putstring(" ");
       putchar(a);
      }
    
      putstring("\n");
      a+=1;
     }
      
     return 0;
    }
    

    Below is the command to compile and run it, followed by its output.

    gcc -Wall -ansi -pedantic main.c -o main && ./main
    This is the official test program for the C version of chastelib.
    Hello World!
    00000000 00 000
    00000001 01 001
    00000010 02 002
    00000011 03 003
    00000100 04 004
    00000101 05 005
    00000110 06 006
    00000111 07 007
    00001000 08 008
    00001001 09 009
    00001010 0A 010
    00001011 0B 011
    00001100 0C 012
    00001101 0D 013
    00001110 0E 014
    00001111 0F 015
    00010000 10 016
    00010001 11 017
    00010010 12 018
    00010011 13 019
    00010100 14 020
    00010101 15 021
    00010110 16 022
    00010111 17 023
    00011000 18 024
    00011001 19 025
    00011010 1A 026
    00011011 1B 027
    00011100 1C 028
    00011101 1D 029
    00011110 1E 030
    00011111 1F 031
    00100000 20 032  
    00100001 21 033 !
    00100010 22 034 "
    00100011 23 035 #
    00100100 24 036 $
    00100101 25 037 %
    00100110 26 038 &
    00100111 27 039 '
    00101000 28 040 (
    00101001 29 041 )
    00101010 2A 042 *
    00101011 2B 043 +
    00101100 2C 044 ,
    00101101 2D 045 -
    00101110 2E 046 .
    00101111 2F 047 /
    00110000 30 048 0
    00110001 31 049 1
    00110010 32 050 2
    00110011 33 051 3
    00110100 34 052 4
    00110101 35 053 5
    00110110 36 054 6
    00110111 37 055 7
    00111000 38 056 8
    00111001 39 057 9
    00111010 3A 058 :
    00111011 3B 059 ;
    00111100 3C 060 <
    00111101 3D 061 =
    00111110 3E 062 >
    00111111 3F 063 ?
    01000000 40 064 @
    01000001 41 065 A
    01000010 42 066 B
    01000011 43 067 C
    01000100 44 068 D
    01000101 45 069 E
    01000110 46 070 F
    01000111 47 071 G
    01001000 48 072 H
    01001001 49 073 I
    01001010 4A 074 J
    01001011 4B 075 K
    01001100 4C 076 L
    01001101 4D 077 M
    01001110 4E 078 N
    01001111 4F 079 O
    01010000 50 080 P
    01010001 51 081 Q
    01010010 52 082 R
    01010011 53 083 S
    01010100 54 084 T
    01010101 55 085 U
    01010110 56 086 V
    01010111 57 087 W
    01011000 58 088 X
    01011001 59 089 Y
    01011010 5A 090 Z
    01011011 5B 091 [
    01011100 5C 092 \
    01011101 5D 093 ]
    01011110 5E 094 ^
    01011111 5F 095 _
    01100000 60 096 `
    01100001 61 097 a
    01100010 62 098 b
    01100011 63 099 c
    01100100 64 100 d
    01100101 65 101 e
    01100110 66 102 f
    01100111 67 103 g
    01101000 68 104 h
    01101001 69 105 i
    01101010 6A 106 j
    01101011 6B 107 k
    01101100 6C 108 l
    01101101 6D 109 m
    01101110 6E 110 n
    01101111 6F 111 o
    01110000 70 112 p
    01110001 71 113 q
    01110010 72 114 r
    01110011 73 115 s
    01110100 74 116 t
    01110101 75 117 u
    01110110 76 118 v
    01110111 77 119 w
    01111000 78 120 x
    01111001 79 121 y
    01111010 7A 122 z
    01111011 7B 123 {
    01111100 7C 124 |
    01111101 7D 125 }
    01111110 7E 126 ~
    01111111 7F 127
    10000000 80 128
    10000001 81 129
    10000010 82 130
    10000011 83 131
    10000100 84 132
    10000101 85 133
    10000110 86 134
    10000111 87 135
    10001000 88 136
    10001001 89 137
    10001010 8A 138
    10001011 8B 139
    10001100 8C 140
    10001101 8D 141
    10001110 8E 142
    10001111 8F 143
    10010000 90 144
    10010001 91 145
    10010010 92 146
    10010011 93 147
    10010100 94 148
    10010101 95 149
    10010110 96 150
    10010111 97 151
    10011000 98 152
    10011001 99 153
    10011010 9A 154
    10011011 9B 155
    10011100 9C 156
    10011101 9D 157
    10011110 9E 158
    10011111 9F 159
    10100000 A0 160
    10100001 A1 161
    10100010 A2 162
    10100011 A3 163
    10100100 A4 164
    10100101 A5 165
    10100110 A6 166
    10100111 A7 167
    10101000 A8 168
    10101001 A9 169
    10101010 AA 170
    10101011 AB 171
    10101100 AC 172
    10101101 AD 173
    10101110 AE 174
    10101111 AF 175
    10110000 B0 176
    10110001 B1 177
    10110010 B2 178
    10110011 B3 179
    10110100 B4 180
    10110101 B5 181
    10110110 B6 182
    10110111 B7 183
    10111000 B8 184
    10111001 B9 185
    10111010 BA 186
    10111011 BB 187
    10111100 BC 188
    10111101 BD 189
    10111110 BE 190
    10111111 BF 191
    11000000 C0 192
    11000001 C1 193
    11000010 C2 194
    11000011 C3 195
    11000100 C4 196
    11000101 C5 197
    11000110 C6 198
    11000111 C7 199
    11001000 C8 200
    11001001 C9 201
    11001010 CA 202
    11001011 CB 203
    11001100 CC 204
    11001101 CD 205
    11001110 CE 206
    11001111 CF 207
    11010000 D0 208
    11010001 D1 209
    11010010 D2 210
    11010011 D3 211
    11010100 D4 212
    11010101 D5 213
    11010110 D6 214
    11010111 D7 215
    11011000 D8 216
    11011001 D9 217
    11011010 DA 218
    11011011 DB 219
    11011100 DC 220
    11011101 DD 221
    11011110 DE 222
    11011111 DF 223
    11100000 E0 224
    11100001 E1 225
    11100010 E2 226
    11100011 E3 227
    11100100 E4 228
    11100101 E5 229
    11100110 E6 230
    11100111 E7 231
    11101000 E8 232
    11101001 E9 233
    11101010 EA 234
    11101011 EB 235
    11101100 EC 236
    11101101 ED 237
    11101110 EE 238
    11101111 EF 239
    11110000 F0 240
    11110001 F1 241
    11110010 F2 242
    11110011 F3 243
    11110100 F4 244
    11110101 F5 245
    11110110 F6 246
    11110111 F7 247
    11111000 F8 248
    11111001 F9 249
    11111010 FA 250
    11111011 FB 251
    11111100 FC 252
    11111101 FD 253
    11111110 FE 254
    11111111 FF 255
    
    

    Finally, here is the source of the library itself, which is included by main.c at the top of the post.

    chastelib.h

    /*
    This file is a library of functions written by Chastity White Rose. The functions are for converting strings into integers and integers into strings. I did it partly for future programming plans and also because it helped me learn a lot in the process about how pointers work as well as which features the standard library provides and which things I need to write my own functions for.
    */
    
    /* These two lines define a static array with a size big enough to store the digits of an integer including padding it with extra zeroes. The function which follows always returns a pointer to this global string and this allows other standard library functions such as printf to display the integers to standard output or even possibly to files.*/
    
    #define usl 32
    char int_string[usl+1]; /*global string which will be used to store string of integers*/
    
    /*radix or base for integer output. 2=binary, 8=octal, 10=decimal, 16=hexadecimal*/
    int radix=2;
    /*default minimum digits for printing integers*/
    int int_width=1;
    
    /*
    This function is one that I wrote because the standard library can display integers as decimal, octal, or hexadecimal but not any other bases (including binary, which is my favorite). My function corrects this and in my opinion such a function should have been part of the standard library but I'm not complaining because now I have my own which I can use forever!
    */
    
    char* intstr(unsigned int i)
    {
     int width=0;
     char *s=int_string+usl;
     *s=0;
     while(i!=0 || width<int_width)
     {
      s--;
      *s=i%radix;
      i/=radix;
      if(*s<10){*s+='0';}else{*s=*s+'A'-10;}
      width++;
     }
    
     return s;
    }
    
    /*
    This function is my own replacement for the strtol function from the C standard library. I didn't technically need to make this function because the functions from stdlib.h can already convert strings from bases 2 to 36 into integers. However my function is simpler because it only requires one argument instead of three (the radix comes from the global variable) and it also does not handle negative numbers. Never have I needed negative integers but if I ever do I can use the standard functions or write my own in the future.
    */
    
    int strint(char *s)
    {
     int i=0;
     char c;
     if( radix<2 || radix>36 ){printf("Error: radix %i is out of range!\n",radix);return i;}
     while( *s == ' ' || *s == '\n' || *s == '\t' ){s++;} /*skip whitespace at beginning*/
     while(*s!=0)
     {
      c=*s;
      if( c >= '0' && c <= '9' ){c-='0';}
      else if( c >= 'A' && c <= 'Z' ){c-='A';c+=10;}
      else if( c >= 'a' && c <= 'z' ){c-='a';c+=10;}
      else if( c == ' ' || c == '\n' || c == '\t' ){return i;}
      else{printf("Error: %c is not an alphanumeric character!\n",c);return i;}
      if(c>=radix){printf("Error: %c is not a valid character for radix %i\n",*s,radix);return i;}
      i*=radix;
      i+=c;
      s++;
     }
     return i;
    }
    
    /*
    this function prints a string using fwrite
    This is the best C representation of how my Assembly programs also work.
    */
    
    void putstring(char *s)
    {
     int c=0;
     char *p=s;
     while(*p++){c++;} 
     fwrite(s,1,c,stdout);
    }
    
    void putint(unsigned int i)
    {
     putstring(intstr(i));
    }
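
    One thing worth keeping in mind about intstr: since it always returns a pointer into the same global int_string, a second call overwrites the digits left by the first. The sketch below copies intstr and its globals from the listing above so that it compiles on its own, then shows the effect. Copying the result with strcpy is just one possible way around it, not something the library itself does.

    ```c
    #include <stdio.h>
    #include <string.h>

    #define usl 32
    char int_string[usl+1];
    int radix=2;
    int int_width=1;

    /* Same logic as intstr in chastelib.h, repeated here so this file stands alone. */
    char* intstr(unsigned int i)
    {
     int width=0;
     char *s=int_string+usl;
     *s=0;
     while(i!=0 || width<int_width)
     {
      s--;
      *s=i%radix;
      i/=radix;
      if(*s<10){*s+='0';}else{*s=*s+'A'-10;}
      width++;
     }
     return s;
    }

    int main(void)
    {
     char saved[usl+1];
     char *first,*second;

     radix=16;
     int_width=2;

     first=intstr(0xAB);
     strcpy(saved,first);  /* keep a private copy before the buffer is reused */
     second=intstr(0xCD);

     /* first and second point at the same bytes, which now hold "CD" */
     printf("%s %s %s\n",first,second,saved); /* prints: CD CD AB */
     return 0;
    }
    ```

    The test program never runs into this because putint prints each result immediately, before the next conversion happens.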
    
    
  • Chastity’s Source for ELF 32-bit executable creation

    I wrote an example program using a custom made ELF-32 header using data declaration statements. I wrote many comments which reference the official specification. This was an exercise both in programming and also Technical Reading and Writing. I had to read enough of the specification PDF file to understand what I was doing. I then tried to write descriptive comments that at the very least I would understand when I need to remind myself how this format is created.

    ;Chastity's Source for ELF 32-bit executable creation
    ;
    ;All data as defined in this file is based off of the specification of the ELF file format.
    ;I first looked at the type of file created by FASM's "format ELF executable" directive.
    ;It is great that FASM can create an executable file automatically. (Thanks Tomasz Grysztar, you are a true warrior!)
    ;However, I wanted to understand the format for theoretical use in other assemblers like NASM.
    
    ;The Github repository with the spec I used is here.
    ;<https://github.com/xinuos/gabi>
    ;And this is the wikipedia article which linked me to the specification document
    ;<https://en.wikipedia.org/wiki/Executable_and_Linkable_Format>
    
    ;This file contains a raw binary ELF32 header created using db,dw,dd commands.
    ;After that, it proceeds to assemble a real "Hello World!" program
    
    ;Header for 32 bit ELF executable (with comments based on specification)
    
    db 0x7F,"ELF" ;ELFMAGIC: 4 bytes that identify this as an ELF file. The magic numbers you could say.
    db 1          ;EI_CLASS: 1=32-bit 2=64-bit
    db 1          ;EI_DATA: The endianness of the data. 1=ELFDATA2LSB 2=ELFDATA2MSB For Intel x86 this is always 1 as far as I know.
    db 1          ;EI_VERSION: 1=EV_CURRENT (ELF identity version 1) (which is current at time of specification Version 4.2 I was using)
    db 9 dup 0    ;padding zeros to bring us to address 0x10
    dw 2          ;e_type: 2=ET_EXEC (executable instead of object file)
    dw 3          ;e_machine : 3=EM_386 (Intel 80386)
    dd 1          ;e_version: 1=EV_CURRENT (ELF object file version.)
    
    p_vaddr=0x8048000
    e_entry=0x8048054 ;we will be reusing this constant later 
    
    dd e_entry    ;e_entry: the virtual address at which the program starts
    dd 0x34       ;e_phoff: file offset where the program header table begins
    db 8 dup 0    ;e_shoff and e_flags are unused in this example, therefore all zeros
    dw 0x34       ;e_ehsize: size of the ELF header
    dw 0x20       ;e_phentsize: size of program header which happens after ELF header
    dw 1          ;e_phnum: How many program headers. Only 1 in this case
    dw 0x28       ;e_shentsize: Size of a section header
    dw 0          ;e_shnum: number of section headers
    dw 0          ;e_shstrndx: section header string index (not used here)
    
    ;That is the end of the 0x34 byte (52 bytes decimal) ELF header. Sadly, this is not the end and a program header is also required (what drunk person made this format?)
    
    dd 1          ;p_type: 1=PT_LOAD
    dd 0          ;p_offset: Base address from file (zero)
    dd p_vaddr    ;p_vaddr: Virtual address in memory where the file will be.
    dd p_vaddr    ;p_paddr: Physical address. Same as previous
    
    image_size=0x1000 ;Chosen size for file and memory size. At minimum this must be as big as the actual binary file (code after header included)
                      ;By choosing a default size of 0x1000, I am assuming all assembly programs I write will be less than 4 kilobytes
    
    dd image_size  ;p_filesz: Size of file image of the segment. Must be equal to the file size or greater
    dd image_size  ;p_memsz: Size of memory image of the segment, which may be equal to or greater than file image.
    
    dd 7           ;p_flags: permission flags: 7=4(Read)+2(Write)+1(Execute)
    dd 0           ;p_align: Alignment (none)
    
    ;important FASM directives
    use32          ;tell assembler that 32 bit code is being used
    org e_entry    ;origin of new code begins at the entry point
    
    ;now, the actual hello world program
    mov eax,4      ;invoke SYS_WRITE (kernel opcode 4 on 32 bit systems)
    mov ebx,1      ;write to the STDOUT file
    mov ecx,msg    ;pointer/address of string to write
    mov edx,13     ;number of bytes to write
    int 80h
    
    mov eax,1 ;function SYS_EXIT (kernel opcode 1 on 32 bit systems)
    mov ebx,0 ;return 0 status on exit - 'No Errors'
    int 80h   ;call Linux kernel with interrupt
    
    msg db 'Hello World!',0Ah
    
    ;This is the makefile I use when assembling and running this program
    
    ;main-fasm:
    ;	fasm ELF-32-hello.asm
    ;	chmod +x ELF-32-hello.bin
    ;	./ELF-32-hello.bin
    
    

    I made a repository for examples like this. Others may want to understand the header used on Linux systems.

    https://github.com/chastitywhiterose/ELF
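
    The e_entry constant in the listing can be cross-checked with a little arithmetic: the code starts right after the 0x34 byte ELF header and the single 0x20 byte program header, so it should land at p_vaddr+0x54. Here is a minimal C sketch of that check. It rebuilds the same header bytes in memory rather than reading the assembled file, and every helper name in it (put_le16, put_le32, get_le16, get_le32, build_headers, first_code_address) is made up for this illustration.

    ```c
    #include <stdio.h>
    #include <string.h>

    /* Little-endian store and load helpers (EI_DATA=1 means least significant byte first). */
    void put_le16(unsigned char *p, unsigned int v)
    {
     p[0]=v&0xFF; p[1]=(v>>8)&0xFF;
    }

    void put_le32(unsigned char *p, unsigned long v)
    {
     p[0]=v&0xFF; p[1]=(v>>8)&0xFF; p[2]=(v>>16)&0xFF; p[3]=(v>>24)&0xFF;
    }

    unsigned int get_le16(const unsigned char *p)
    {
     return (unsigned int)p[0] | ((unsigned int)p[1]<<8);
    }

    unsigned long get_le32(const unsigned char *p)
    {
     return (unsigned long)p[0] | ((unsigned long)p[1]<<8) |
            ((unsigned long)p[2]<<16) | ((unsigned long)p[3]<<24);
    }

    /* Fills h (at least 0x54 bytes) with the same values as the assembly listing. */
    void build_headers(unsigned char *h)
    {
     memset(h,0,0x54);
     memcpy(h,"\x7F" "ELF",4);       /* split so 'E' is not read as a hex digit */
     h[4]=1; h[5]=1; h[6]=1;         /* EI_CLASS, EI_DATA, EI_VERSION */
     put_le16(h+0x10,2);             /* e_type */
     put_le16(h+0x12,3);             /* e_machine */
     put_le32(h+0x14,1);             /* e_version */
     put_le32(h+0x18,0x8048054UL);   /* e_entry */
     put_le32(h+0x1C,0x34);          /* e_phoff */
     put_le16(h+0x28,0x34);          /* e_ehsize */
     put_le16(h+0x2A,0x20);          /* e_phentsize */
     put_le16(h+0x2C,1);             /* e_phnum */
     put_le16(h+0x2E,0x28);          /* e_shentsize */
     put_le32(h+0x34,1);             /* p_type */
     put_le32(h+0x38,0);             /* p_offset */
     put_le32(h+0x3C,0x8048000UL);   /* p_vaddr */
     put_le32(h+0x40,0x8048000UL);   /* p_paddr */
     put_le32(h+0x44,0x1000);        /* p_filesz */
     put_le32(h+0x48,0x1000);        /* p_memsz */
     put_le32(h+0x4C,7);             /* p_flags */
     put_le32(h+0x50,0);             /* p_align */
    }

    /* Memory address of the first byte after both headers, assuming a single
       segment whose p_offset is 0, as in the listing. */
    unsigned long first_code_address(const unsigned char *h)
    {
     unsigned long phoff=get_le32(h+0x1C);
     unsigned long headers_end=phoff+(unsigned long)get_le16(h+0x2A)*get_le16(h+0x2C);
     unsigned long vaddr=get_le32(h+phoff+8);
     return vaddr+headers_end;
    }

    int main(void)
    {
     unsigned char h[0x54];
     build_headers(h);
     printf("e_entry       = 0x%lX\n",get_le32(h+0x18));
     printf("after headers = 0x%lX\n",first_code_address(h));
     return 0;
    }
    ```

    Both lines should show 0x8048054. If a second program header were ever added, the entry point would have to move by another 0x20 bytes.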

  • The Bitwise Operations

    There are 5 bitwise operations which operate on the bits of data in a computer. For the purpose of demonstration, it doesn’t matter which number the bits represent at the moment. This is because the bits don’t have to represent numbers at all but can represent anything described in two states. Bits are commonly used to represent statements that are true or false. For the purposes of this section, the words AND, OR, XOR are in capital letters because their meaning is only loosely related to the English words they get their name from.

    Bitwise AND Operation

    0 AND 0 == 0
    0 AND 1 == 0
    1 AND 0 == 0
    1 AND 1 == 1
    

    Think of the bitwise AND operation as multiplication of single bits. 1 times 1 is always 1 but 0 times anything is always 0. That’s how I personally think of it. I guess you could say that something is true only if two conditions are true. For example, if I go to Walmart AND do my job then it is true that I get paid.

    Bitwise OR Operation

    0 OR 0 == 0
    0 OR 1 == 1
    1 OR 0 == 1
    1 OR 1 == 1
    

    The bitwise OR operation can be thought of as something that is true if one or two conditions are true. For example, it is true that playing in the street will result in you dying because you got run over by a car. It is also true that if you live long enough, something else will kill you. Therefore, the bit of your impending death is always 1.

    Bitwise XOR Operation

    0 XOR 0 == 0
    0 XOR 1 == 1
    1 XOR 0 == 1
    1 XOR 1 == 0
    

    The bitwise XOR operation is different because it isn’t really used much for evaluating true or false. Instead, it is commonly used to invert a bit. For example, if you go back to the source of my graphics programs in Chapter 2, you will see that most of those programs contain the statement:

    index^=1;

    If you look at my XOR chart above, you will see that using XOR of any bit with a 1 causes the result to be the opposite of the original bit. In the context of those programs, the index variable is meant to be 0 to represent black and 1 to represent white. The XOR operation is the quickest way to achieve this bit inversion. In fact, in all my years of programming, that’s pretty much the only thing I have used it for!
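
    The three charts and the bit inversion trick can be reproduced with a short C program. This is only a sketch: C spells the operations as &, | and ^, and the flip function is a made-up name for this example rather than anything from my graphics programs.

    ```c
    #include <stdio.h>

    /* Hypothetical helper: XOR with 1 turns 0 into 1 and 1 into 0. */
    int flip(int bit)
    {
     return bit^1;
    }

    int main(void)
    {
     int a,b,i,index=0;

     /* print all four input combinations for AND, OR and XOR */
     for(a=0;a<2;a++)
     {
      for(b=0;b<2;b++)
      {
       printf("%d AND %d == %d   %d OR %d == %d   %d XOR %d == %d\n",
              a,b,a&b, a,b,a|b, a,b,a^b);
      }
     }

     /* index alternates 1,0,1,0 just like index^=1 would */
     for(i=0;i<4;i++)
     {
      index=flip(index);
      printf("index is now %d\n",index);
     }
     return 0;
    }
    ```

    Each printed row should match the corresponding line of the charts above.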

    Bitwise Left and Right Shift Operations

    Consider the case of the following 8 bit value:

    00001000

    This would of course represent the number 8 because a 1 is in the 8’s place value. We can left shift or right shift.

    00001000 ==  8 : is the original byte
    
    00010000 == 16 : after left shift
    00000100 ==  4 : after right shift
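
    The same three lines can be produced by a small C program using the << and >> operators. This is a sketch under a couple of assumptions: the value is unsigned, and byte_to_binary is a helper name invented here to show the low 8 bits.

    ```c
    #include <stdio.h>

    /* Hypothetical helper: writes the low 8 bits of v into out as '0' and '1'
       characters, most significant bit first, followed by a terminating zero. */
    void byte_to_binary(unsigned int v, char out[9])
    {
     int bit;
     for(bit=0;bit<8;bit++)
     {
      out[bit]=(v&(0x80u>>bit))?'1':'0';
     }
     out[8]=0;
    }

    int main(void)
    {
     unsigned int original=8;
     char bits[9];

     byte_to_binary(original,bits);
     printf("%s == %2u : is the original byte\n",bits,original);

     byte_to_binary(original<<1,bits);
     printf("%s == %2u : after left shift\n",bits,original<<1);

     byte_to_binary(original>>1,bits);
     printf("%s == %2u : after right shift\n",bits,original>>1);
     return 0;
    }
    ```

    Shifting by more than one position works the same way, so shifting left by 3 multiplies by 8 as long as no 1 bits fall off the top.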
    

    Left and right shift operations allow us to multiply or divide a number by 2 by taking advantage of the base 2 system. These shifts are essential in graphics programming because sometimes you need to extract the red, green, or blue values separately out of their 24 bit representation. For example, consider this code:

       pixel=p[x+y*width];
       r=(pixel&0xFF0000)>>16;
       g=(pixel&0x00FF00)>>8;
       b=(pixel&0x0000FF);
    

    The first statement gets the pixel out of an array of data which is indexed by x and y geometric coordinates. This will be a 24 bit value, or in some cases 32 bit with the highest 8 bits representing the alpha or transparency level.

    The variables r, g, and b represent red, green, and blue. With clever use of bitwise AND operations and right shifting by the correct number of bits, it is possible to extract just the color component to be modified. Without the ability to do this, my graphics animations and my Tetris game would never have been possible. The exact colors had to be sent to the drawing functions. This is true not just for SDL but for any graphical system involving colors.
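
    Going the other direction uses left shifts and OR. Below is a minimal sketch that pulls a pixel apart, changes one component, and puts it back together. It assumes a 0xRRGGBB layout with no alpha byte, and the names red_of, green_of, blue_of and pack_rgb are made up for this example.

    ```c
    #include <stdio.h>

    /* Hypothetical helpers for a 24 bit 0xRRGGBB pixel. */
    unsigned int red_of(unsigned long pixel)  {return (unsigned int)((pixel&0xFF0000UL)>>16);}
    unsigned int green_of(unsigned long pixel){return (unsigned int)((pixel&0x00FF00UL)>>8);}
    unsigned int blue_of(unsigned long pixel) {return (unsigned int)(pixel&0x0000FFUL);}

    /* Shift each component back to its position and combine them with OR. */
    unsigned long pack_rgb(unsigned int r, unsigned int g, unsigned int b)
    {
     return ((unsigned long)(r&0xFF)<<16) |
            ((unsigned long)(g&0xFF)<<8)  |
             (unsigned long)(b&0xFF);
    }

    int main(void)
    {
     unsigned long pixel=0xFF8040UL;
     unsigned int r=red_of(pixel);
     unsigned int g=green_of(pixel);
     unsigned int b=blue_of(pixel);

     printf("r=%u g=%u b=%u\n",r,g,b);       /* prints: r=255 g=128 b=64 */

     g/=2;                                   /* halve the green component */
     pixel=pack_rgb(r,g,b);
     printf("new pixel = 0x%06lX\n",pixel);  /* prints: new pixel = 0xFF4040 */
     return 0;
    }
    ```

    A 32 bit format with alpha would need one more mask and a shift by 24, but the idea is the same.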

    Learning More

    I know I covered a lot in this chapter but I encourage you to learn about the binary numeral system and its close cousin the hexadecimal system. If you do an online search, you will find courses, tutorials, and videos by many people who can probably explain these same concepts in a way that you understand better if you are still confused after reading this chapter!