Minilang - a simple parser in C. Part 2

I have refactored my code. As my first more serious C code it was quite ugly, broke DRY several times, naming was off and
overall code was in a big mess so I have dedicated half of my Sunday to write it again from scratch and I can call it a success.

What's new

Now it supports parentheses and solves them correctly. Handles whitespace without a problem and understands unary operators. So now we can solve 10*-2 to get -20. Unfortunately unary support is not complete yet as it only supports the case written above. In other cases it doesn't work because of a simple logic flaw.

Why unary doesn't work

In cases like -2+2 current code yields 4 instead of 0. Because operator precedence is: parentheses, multiplication/division, addition/subtraction, unary, 2+2 is matched and solved first, leaving AST tree as SUB ADD(2;2). In this case I am deciding what to do. To implement some additional template, or inject unary operator inside ADD node itself and handle it in solver. Or yet the most elegant solution of those - to implement parenthesis around the template, so in -2+2 cause parser would convert this equation into (-2)+2 yielding it to VAL(-2)+2.

Refactoring

I also got rid of checking if there's a dot in the expression as I do not care about integers or floats. I cast all numerics I have found into double datatypes and don't think about them twice. I have tried to do my best to give variables and functions meaningful names and the code is full of printf calls on each step of processing so to catch the bugs as early as possible. What I am ashamed of - memory leaks. There's not even one call of free and there should be.

Minilang can be found on GitHub.

Minilang - a simple parser in C

I developed a new goal - to learn C and C++ while implementing cool stuff. My first idea is to write a basic mathematical equations parser and solver to solve simple equations like 1 + 1 * 3 up to 5 + 8 * (9 - 4 * (10 * 5) + 7).

I have hosted minilang on GitHub.

Getting tokens

So firstly I have an input like a predefined string or from scanf() it doesn't matter. And I want to have tokens meaning 1 + 2 would have 3 tokens: 1, + and 2.

My idea is to build a linked-list like structure and have each token point to another. After this list is complete I should transform it into AST(Abstract Syntax Tree).

So lets begin to build. First I will need to have some structs. node_value is union struct where I can hold INTs or FLOATs in the same place without actually holding two fields.

1
2
3
4
5
6
7
8
9
10
11
12
13
union node_value {
	int iValue;
	float fValue;
};

struct ast {
	enum { VALUE, OPERATOR } node_type;
	enum { INT, FLOAT} value_type;
	enum { ADD, SUB, MUL, DIV} operator_type;
	short precedence;
	union node_value value;
	struct ast* next;
};

AST struct is maybe more complicated but it only shows that node can be VALUE or OPERATOR, it‘s type(INT or FLOAT) if it’s a value and operator type if it's an operator. Precedence is just for the future(as * and / are higher precedence than + or -). And I have next at the end to point to the next node.

Lets give it an input

char input[] = "123+456.78";
I am okay with this.

My ugly main function

I go through each character in input while detecting if current char is a digit or not. If it's not a digit I check to see if it was before - last_was_numeric. dot_was_detected shows that it's a float value so we can distinguish between integers and floats.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
int main(int argc, char** argv)
{
	printf("Lets process this: %s\n", input);

	struct ast * ast_head = (struct ast*)malloc(sizeof(struct ast));
	ast_head->next = NULL;

	// This will be set once numeric is detected. Then if dot is detected it will be known it's float
	int numeric_start_pos = -1;
	union node_value numeric;
	bool last_was_numeric = false;
	bool dot_was_detected = false;

	for (int i = 0, len = sizeof(input) - 1; i < len; i++) {
		printf("\nProcessing %c\n", input[i]);
		char val = input[i];		

		if (val >= '0' && val <= '9') {
			printf("Numeric detected at %d. Not sure if int or float yet\n", i);
			if (numeric_start_pos == -1 && !last_was_numeric) {
				numeric_start_pos = i;
				last_was_numeric = true;
			}			

		} else if (val == '+' || val == '-') {	printf("+ or - detected\n"); }
		else if (val == '*' && val == '/') { printf("* or / detected\n"); }
		else if (val == '.') { dot_was_detected = true;	}
		else { printf("Unsupported operator\n"); }

		// check for numeric token end
		if ( i+1 == len || ((val < '0' || val > '9') && val != '.' && numeric_start_pos != -1 && last_was_numeric)) {
			printf("Numeric ended at %d\n", i);

			char buffer[100];
			memset(buffer, '\0', sizeof(buffer));
			strncpy(buffer, (input + numeric_start_pos), i - numeric_start_pos+1);
			struct ast * new_node = (struct ast*)malloc(sizeof(struct ast));
			new_node->node_type = VALUE;
			new_node->next = NULL;

			if (dot_was_detected) {
				new_node->value.fValue = atoi(buffer);
				new_node->value_type = FLOAT;
				struct ast * curr = ast_head;

				// rewing head to the last element
				while (curr->next != NULL) {
					curr = curr->next;
				}
				curr->next = new_node;

				printf("Float translated: %f", atof(buffer));
			}
			else {				
				new_node->value.iValue = atoi(buffer);
				new_node->value_type = INT;
				struct ast * curr = ast_head;

				// rewing head to the last element
				while (curr->next != NULL) {					
					curr = curr->next;
				}
				curr->next = new_node;

				printf("Int translated: %d", atoi(buffer));
			}			

			dot_was_detected = false;
			numeric_start_pos = -1;
			last_was_numeric = false;
		}
	}

	print_ast(ast_head);
	getch();
	return 0;
}

In conclusion it does nothing except for building a linked list just from numeric nodes. Like in 123+456.78 case it simply builds a list of two nodes: [123]->[456.78].

What I can say from this ugliness - a lot of repetition and probably unneeded code. In the second post I will have it updated.

Emacs - Part 3

So I am here at part 3 improving my Emacs understanding each day while experimenting. I‘ve gotta say - there’s a lot of useful stuff I have never header before!

The bad

In Sublime Text I was used to two things very very very well and one thing quite a bit. Two being: mini map and tabbed interface, one being file tree explorer. My disappointment is that I have tried to use Tabbar but it really sucked as I immediately used Emacs in split-window mode and each of the windows have it's own tab header which is maybe great but really unnatural for me. I will forget this for a while until I can mod it on my own or just get used to buffer switching.

The middle-man

I have used speedbar which is average on file exploring(no tree structure) but has it‘s main disadvantage - it’s a separate window. Come on - who uses that? Even if I work with a single screen I am still not going to resize Emacs to windows mode just to have speedbar opened. Or maybe I shouldn't? Well.. - sr-speedbar to the rescue! Just M-x package-list install sr-speedbar and it is “Single Frame Speedbar” which is great. Except it's quite ugly.

The good

Oh man! I have mentioned skewer in part 2 and now I have played with it for a while. I didn‘t give it a go in real project but for experiments it’s amazing!. Let me tell you why if you don't know - it allows direct HTML/CSS/JS modifications in the real time. That sounds great but without an example these are just words! So example:
1. Install skewer with the usual M-x package-install
2. go into dired mode, hit + create “demo” folder and enter it
3. C-x C-f type in “demo.js” RET RET
4. 3rd step with “demo.html”

Now as my beginner emacs configuration may really suck I have manually enabled js2-mode and skewer-mode on my demo.js buffer and skewer-html on demo.html buffer.

Let the magic start! - M-x run-skewer. Bam a new binded browser window appears. That's cool. But sadly unusable.

Now as I have mentioned I am really sucky on Emacs so just made a node.js http server with connect library(to install - npm install connect):

1
2
3
4
var connect = require('connect');
connect.createServer(
    connect.static(__dirname)
).listen(8000);

Now - node server.js

Now skewer runs on 8080 port and our demo app is hosted on 8000 port. All good but now how do I bind my localhost:8000 to skewer? I have missed it at first but in readme there this javascript code:

1
(function(){var d=document;var s=d.createElement('script');s.src='http://localhost:8080/skewer';d.body.appendChild(s);})()

Man that's advanced…! This script is really sucky because it appears that if I press some key bindings at the wrong file or place it throws an error and that's all - I need to refresh the window manually and inject the script again. I will look for the error to appear again and will modify skewer js code to handle the situation painlessly.

As this code is injected that‘s all. Our files are binded to our demo.html page. Let’s see what we can do. In demo.html let's type in:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
<html>
  <head>
    <script type="text/javascript" src="demo.js"></script>
	<style>
	  #box {
	    position: absolute;
		width: 20px;
		height: 20px;
		background: blue;
	  }
	</style>
  </head>
  <body>
    <div id="box"></div>
	<button id="click">Click to move!</button>
  </body>
</html>

Lets just now hit C-M-x. What I have noticed that it must be done at once. Not Control then Alt then x, but Control and Alt and then x. This loads the html code to browser.

Now inside demo.js lets do some simple stuff:

1
2
3
4
5
var left = 0;
document.getElementById('click').onclick = function() {
  left += 2;
  document.getElementById('box').style.left = left + 'px';
}

Now again - C-M-x to load the code in the browser. Now if I click the button a few times I can see that it moves by 2 pixels to the right after each click. The magic begins when I go to the demo.js and change left += 2; into left += 20; and hit C-M-x. Now lets click and what do we see? We can see the box moving to the right by 20 pixels but the page was not refreshed! The page kept it's state, variables and other things in place!

Extra

As I have mentioned in part 2 - C-x C-e evaluates elisp normally and javascript in js2-mode so if I move my cursor to the end of left += 2; then hit C-x C-e and can see current value in the mini buffer. Yay! The same can be done for DOM elements where document.getElementById() appears!

Conclusion

I was really excited about skewer and really wanted to have it working. I simply don‘t know why or if it’s useful but during my javascript development I really got tired of having my page state reseted after each change. Sometimes it‘s just too slow for me to write some kind of automation because it’s just quicker and more efficient to hit F5 and manually fill in the details or adjust components to their desired state… Now with skewer in my hands I may do pretty things.

Emacs - Part 2: Skewer mode

Skewer

The second part went really quick(another day as you can see). I have found skewer-mode for emacs. I have found another - swank-js but skewer is made to be EASY to setup and so I tried. And so I enjoyed it.

Today is my second day using Emacs and I have full comfort in it. That just shows how little of functionality I have actually used from my editors in the past. Or maybe that Emacs is that good. So what‘s the skewer? It’s Javascript evaluator in the browser as I can say in my own words. Why is it cool? Well… BECAUSE I CAN EDIT MY JAVASCRIPT DIRECTLY AND CHANGES APPEAR AS IMMEDIATELY AS I WANT.

Let's look at the steps:
1. C-x C-f type in “demo.js” RET
2) in the new file I do M-x run-skewer
3) Then M-x skewer-mode
4) Now I type this code:

1
2
3
4
5
6
7
8
9
10
11
  var obj = {
	name: "test object",
	run: function() {
	  return 'I am running...';
	}
  }

  var a = 3;
  var x = Math.random();
  x;
  obj.run();

Now to evaluate it's the same as for elisp - C-x C-e. Now if I put my cursor on the x; at the end of it then C-x C-e and I see in the mini buffer - a random number - a result from Math.random()!. Now let's have some fun and put the cursor at the end of obj.run(); hit C-x C-e and what? An error? Ah yes! C-c C-k to load the whole source into the buffer then again retry this and it sure outputs “I am running…” in the mini buffer.

Unfortunately to log something out console.log() doesn‘t work. There’s a special skewer.log() for this but I really don‘t enjoy it. Now on the other side - there’s no reason an application should hold console.log() statements so I can live with that too. Note that once you ran M-x run-skewer it opened a new windows in a default browser and binded emacs to it. So for example if I type in alert('EMACS'); put cursor at the end and hit C-x C-e it sure fires an alert in a binded window on a browser.

Extra

There's a REPL which can be opened with M-x skewer-repl. I don't know how much this is useful as I am used to pressing C-shift-j or F12 and manually selecting Console to do my REPL stuff.

Emacs - Part 1

I have always had interest in some old bizarre editor like vim or emacs. A few years ago I started to notice that I really enjoy using keyboard instead of a mouse and so I tried to learn all most often used key combination in all the software I use daily.

I though that I am a typing machine because I use so much of my keyboard. Wrong. I compiled a list of my often used combinations and it is… small. Look:

Combination Description
C-a select all
C-c copy selected
C-x cut selected
C-v paste
C-z undo
C-z redo
C-f search
C-h search and replace
C-s save
C-d set multiple cursors
C-shift-s save as
shift-arrow move in direction while selecting. For example if I am at the top of the page and I do
C-arrow jump by words forward or backward
C-shift-arrow move block of selected text around
C-] and C-[ indent block forward/backwards
C-tab shift through the files forward
C-shift-tab shift backwards through the files

For example: now I press and hold PgUp key then hit Home key to get back to the start of a line then shift-down(5 times) and I have just successfully selected first 5 lines. Then I can copy them (C-c) and paste anywhere I want or to move them with C-shift-arrows.

As a great exercise I have turned this blog entry into my emacs exercise - I am writing this post from within emacs in markdown. On the very basic level I have really liked emacs after maybe two hours of experimenting and I have almost have all my key bindings in emacs. I did not redefine emacs commands so they would match the ones in the table. I am trying to use emacs as it is.

In emacs there are only two control keys - C(trl) and M(alt) by default. It really clicks me as I am used that these keys provide additional functionality for my keys. The commands are written in a form like C-x C-f which means press and hold Ctrl then x, release both, press and hold Ctrl then f, release. So here's the article with translations of my combinations to emacs combinations.

I have not put them in table as emacs has a bit different interface. For example in emacs way to select some text I go to the starting position and click C-space then I go around with arrow keys and once I want to copy what I have selected I do M-w.

Copy + Paste

To paste the text it's C-y. Copying and pasting was the fastest thing in emacs I got right from the start. I move my arrows then C-space then M-w then somewhere else it's C-y. If I want to cut the text - it's C-w instead of M-w. See the similarity? The same key but different control key and it's copy or cut.

Undo + Redo

To undo/redo I use C-/. This works like a normal undo however if I would like to redo I just click C-f and then each C-/ is a redo. C-f again and it‘s undo again. I believe this will be a little sucker for a while as I don’t have any clue about current state. While writing this post I have done it a few times and it really feels natural for me except the C-f part where I am not sure if I am undoing or redoing.

Search

To search within a current file I use C-s. This is not as I am used too because this only searched in down-direction only and doesn‘t accept a regex. Of course I don’t believe emacs does not have an easy way to fix this and doing C-h k meaning describe a keybinding for me gives me that pressing M-r gives me regular expression search. Yes it works good. It sucks a bit that some of keys like DEL have special meaning so it's not so easy to modify search so I just press C-g to cancel current command then hit C-s again and type in from the scratch. I don't want to burden myself with ALL the details about each command, I just want a way to do my daily combinations even if not so well but I really think this will be over in a few moments. On the other note - C-r turns on search backwards so I can live with that temporarily until I will learn enough Emacs LISP to modify them.

Search + Replace

Search without a replace is only 50% of functionality I need. Search and replace can be done with M-% which is M-shift-5 on my keyboard. I simply enter string to replace hit RET then enter string to replace with. No this will not immediately change everything but will go through each occurrence of entered string. n skips the replace for selected result and y or space replaces the occurrence. Of course sometimes I like simply ALL the occurrences changed then instead of y or space one should hit ! to replace all the occurrences.

Selections

Select all. I though that this will be available as some easy command but I managed to get around this by a quick combination of keys. Lets rinse and repeat:
1. C-home - goes to the very top and puts cursor at the first character on the first line of my current buffer.
2. C-space - turns on selection mode.
3. C-end - goes to the very bottom to the last character so all the content is selected.
4. M-w - copy all I have selected(all the content).
NEAT!
What‘s more neat that there’s a default keybinding for this - C-x h. I really though that I will have to remap these myself and I am happy that I will not have to do so.

Files

Saving files is simple. I noticed that a lot of IO commands start with C-x. For example if I want to open some file I do C-x C-f then type in the whole filename or just part of it and hit tab then I see possible files for my query. I can also navigate in the query through the folders. If I type in not existing file I simply just got it created. This doesn't feel all natural as if I close the file with C-x C-k(kill buffer) it doesn't get saved. C-x C-s does that as it forces to write the contents into the file I just created. If I am lazy enough to open search file query and create a new file from there I can hit C-x C-w and enter a filename in the query to have current buffer saved as that file. So now for example if I create a HTML file and would like to save it and work on it later(saving each modification) I do these steps:
1. C-x C-f
2. enter test.html
3. type in
4. C-x C-s
Done. Now we have test.html file with <html></html> as contents and it's currently opened in our emacs buffer.

Cursors

Multiple cursors for the win! Multiple cursors are simply an effect of rectangles. Rectangle commands can be reached with C-x r then often another letter is an operation. For example t - insert text, d - delete. Now rectangles can be quite tricky as from my short experience with emacs. For example if I have this text:

hello/world/file1.txt
hello/world/file2.txt
hello/world/file3.txt
hello/world/file4.txt
hello/world/file5.txt

and I would like to delete hello/world/ part or replace with another/path/. Lets do this:
1. go to the first hello, put cursor on h letter
2. C-space or simply hit shift
3. with arrows go to world on the last line, put cursor on / character
4. C-x r d to delete all the hello/world/ or C-x r t then type in another/path/ and hit enter

Vuola! This is not like multiple cursors where I was used to Sublimes C-d to set the cursor on another occurrence on currently selected word or search result however for a while I can live with rectangles or search and replace.

Buffers

Shifting through the buffers or files. Multiple windows. Now as I have opened a few files I have one big screen with this blog entry in it. I can switch through the buffers with C-x arrow where arrow is left/right arrow. So yeah C-x right right right left left left returns me back to where I am. Of course this is handy only when I have maybe 2 or 3 files opened and I want to switch between them quickly… But what if I have multiple buffers opened and just want to navigate through them? C-x b to the rescue! As I have ido mode enabled once I press C-x b I get a nice list of buffers and I can simply type part of the name of it and hit RET and here I am. So now I see that I have buffers: “Shell Command Output”, “Help”, “Info”, "emacs_part1.md", … I hit C-x b type in shel hit RET and I am in “Shell Command Output” buffer. While being there I hit C-x b type in “em pa” hit RET and I am inside this blog again.

Now what about windows? Hit C-x 2 and two windows appear with the same blog in it splitted horizontally. If I hit C-x 3 two windows appear side by side. Now I am currently in my first buffer and I want to display “Shell Command Output” in another. I hit C-x o to switch buffers and my cursor appears in the second window. C-x b type in “she” hit RET and the second window is filled with “Shell Command Output” contents in it. Now I can switch back to my blog with C-x o again. This is really nice! While I am on my blog, I would like to make it again single fullscreen so I hit C-x 1.

Extra!

Some extra before finishing first part! We can use ELISP anywhere in the emacs. What do I mean? If I type in here (* 1 2 3), then select it with shift and arrow then hit C-x C-e it evaluates the expression and shows the result in the bottom.

Conclusion

Moving selected lines of text around as I do easily in sublime is not that basic from my research so I leave it for another part.

What I suggest - first watch the parts on YouTube and then you can use this post as a reference. The video series go through the basics. Customization of emacs and common problems. All the parts weight maybe 40minutes in total so it's really a no brainer not to watch them.

1
2
(My current happiness levels because of emacs?)
> High!

Building functions in realtime

While solving a challenge on coding katas I came accross one more LISPy thing. The challenge was to construct a string from an expression(like(this())) // => "expression like this." note the dot at the end.

It is really similar to the one I solved some time ago where the solution should process this code:
one(plus(two(minus(three())))) // => 0
The solution was quite simple: I just returned numbers for each digit function(one, two, three…nine) and some processing for arithmetic operations.

In this case I built a wrapper:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
function build_functions_from_names(/* names */) {
  var names;
  if (typeof arguments[0] == 'object') {
    names = arguments[0];  
  } else {
    names = Array.prototype.slice.call(arguments);
  }
  
  names.forEach(function(name) {
    this[name] = function(partial) {
      partial = partial ? (' ' + partial) : '.';  // take care of a dot
      return name + partial; 
    };
  });  
}

Now of course it is bad, yes it is, because it pollutes global namespace. But now I just wanted to enjoy the damn thing and experiment. A solution for having a nice wrapper in this case handling custom namespaces would be wrap the wrapper inside self-executing function and pass in this like

1
var wrapper = (function(namespace) {...})(this);

but this causes a headache of having a nice way to pass in arguments. The simplest way would be to return a function making wrappers for passed namespace but that I consider a bloat in this fun case.

How does it produce a string? Simply:

1
2
build_functions_from_names('This is just a test'.split(' '));
console.log(This(is(just(a(test() )))) ); // => "This is just a test"

I really enjoy bending Javascript.

brainfuck

Brainfuck is a really easy language without no real purpose despite the fact it's turing complete. This challenge was a coding kata on Codewars. Please note referral url!

As a training exercise it is quite nice because it can be done in a very short period of time. One has to control flow with brackets so must keep track of them and overall fun to experiment in parser area.

The most difficult thing for me was to trace a bug which was inside brackets controller(loops). My handler extracted wrong brackets and the whole program misbehaved. How? Well when brackets are like [..[..[..]]] it is all good but when they are like [..[..]..[..]..] it is bad because I was matching opening brackets, closing brackets then reversed closing brackets and zipped both arrays together. Wrong way! The correct way is to keep track of current depth which I did by simply pushing opening bracket onto array and then poping it when closing bracket was found.

I maybe left one thing bad in this code - preallocated memory and initialized as 0. Preallocated memory is OK but preallocated zeroes are bad because at the end, the result is like 1,2,3,4,0,0,0...0,0 so I must filter that out. This thing is really easy to fix by simply avoiding preinitialization and checking for undefined in + and - controllers.

Also in the last few days I was reading Clean Code and made it a goal for my source code to be as readable as possible.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
function brainLuck(code, input) {
  var size = 512;       // size shows how much memory is preallocated for the program
  var limit = 100000;   // short-circuit on infinite loop
  
  // preallocate memory and initialize all with 0
  var pointer = size / 2;
  var data = new Array(size);
  for (var i = 0; i < data.length; i++) data[i] = 0;
  var output = [];    
  
  // instruction pointer
  var ip = 0;
  
  // loop_pairs will be "double-linked" list. Starting of a bracket will point to closing of it and vice-versa
  var loop_pairs = [];
  
  var opening_brackets = [];
  var closing_brackets = [];
  
  code.split('').forEach(function(char, idx) {
    if (char == '[') {      
      opening_brackets.push(idx);
    } else if (char == ']') {      
      var opening_bracket = opening_brackets.pop();
      loop_pairs[idx] = opening_bracket;
      loop_pairs[opening_bracket] = idx;
    }
  });
  
  // process commands
  var iteration = 0;
  while (ip < code.length) {
    iteration++;
    if (iteration >= limit) break;
    var opcode = code[ip];
    
    if (opcode == '[') { // loop start
      if (data[pointer] == 0) {
        ip = loop_pairs[ip];        
        continue;
      }
    } else if (opcode == ']') { // loop end
      if (data[pointer] != 0) {        
        ip = loop_pairs[ip];        
        continue;
      }
    }
    else if (opcode == '>') { pointer++ }
    else if (opcode == '<') { pointer-- }    
    else if (opcode == '+') { data[pointer]++; if (data[pointer] > 255) data[pointer] = 0; }
    else if (opcode == '-') { data[pointer]--; if (data[pointer] < 0) data[pointer] = 255; }
    
    else if (opcode == '.') {
      output.push(data[pointer]); 
    } else if (opcode == ',') { // handle input
      if (input.length == 0) {
        data[pointer] = '';
      } else {        
        // slice the first char of input and grab it's ascii code
        data[pointer] = input[0].charCodeAt(0);
        input = input.substring(1);
      }
    }        
    ip++; 
  }
   
  // glue the results back. Filter for non-zero values because of preallocated memory containing 0
  return output.filter(function(i) { return i > 0 }).map(function(i) { return String.fromCharCode(i) }).join('');
}

I am over with Scheme

I am done with Scheme and probably SICP I have written earlier. I was thinking that I am lazy or it is too difficult but the real reason why I gave up is because I find Scheme code simply not pleasant to write and especially - to read.

While SICP provides very good examples of data structures, algorithms I prefer another way - try to implement algorithms with F#. I have seen this beautiful mixed language earlier and really wanted to give it a shot. A better shot than I gave when I first learned about functional way.

I may find F# ugly or hard for me too but it's really only an advantage to see another language in action. So beware! My first F# post will be posted soon.

SICP Part 1

SICP - Structure and Interpretation of Computer Programs. This is a recommended watching/reading on a lot of sources like StackOverflow and other places of interned. I am curious to change my perspective a bit.

I have tried F#, Haskell for a while. Reading blogs, watching live coding sessions as well as reading code but it seemed that I do not build that functional thinking. So I decided to start from the very roots - SICP. Original SICP shows you all examples and teaches Scheme language. I chose DrRacket which contains Scheme implementation with useful addons and especially from the first sight - a debugger!

I just wanted to give it a try to see if it helps for me. I have tried watching original SICP videos but the problem is - I do not have patience watching a video so I chose reading an online book(google for pdf of course).

Syntax basics

Scheme syntax is really basic but seems to suck at first because of its notation.
Instead of

1
2
x = a + b + c
(1 + 2 + 3) / 3

you need to write

1
2
(define (x) (+ a b c))
(/ (+ 1 2 3) 3)

It really seems annoying but come on - it may be just a starting point to different point of view I thought :).

define - defines everything. Functions, variables, etc. The syntax is really clear from the example. For variable/function names you are allowed to use dashes or question marks which is awesome and great. I loved Ruby feature of having a question mark in method name so to be more readable like even? 3 => false.
conditional statements can be done with cond or if:

1
2
3
4
5
6
7
8
9
10
11
12
; absolute function logic
(cond 
  ((> x 1) 1)
  ((< x 0) 0)
  ((= x 0) 0)
))

; or using if
(if (< x 0)
  (- x)       ; if true
  x           ; if false
)

And Yeah - single line comments are defined with ; your comment.
Moving on the very first task comes to the view - implementation of Newton-Raphson method of square root algorithm. Basically it goes like this:
1. We have initial guess, like 1
2. We check if our guess is ok simply by doing abs((guess*guess) - x)) <= precision
3. Guess still sucks? Improve!
4. Improve algorithm: guess = ((x / guess ) + guess ) / 2

Just to check it's correctness I wrote it in JS of course

1
2
3
4
5
6
7
8
9
function sqrt(x, precision) {
  var guess = 1;
  while ( Math.abs((guess * guess) - x) > precision) {
    guess = ((x / guess) + guess) / 2;
  }
  return guess;
}

sqrt(2, 0.001); // => 1.4142156862745097

And here comes Scheme implementation:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
#lang racket

; our main loop is sqrt-iter with arguments: guess(initial) and x
(define (sqrt-iter guess x)
  (if (good-enough? guess x)
      guess                               ; Guess is good, return it
      (sqrt-iter (improve guess x) x)     ; Guess sucks - recursively iterate with improved guess
  )
)

; guess improval: ((x / guess ) + guess ) / 2
(define (improve guess x)
  (average guess (/ x guess))  
)

; Average looking function for average()
(define (average x y)
  (/ (+ x y) 2)  
)

; Check for guess precision. (guess^2 - value) < precision
(define (good-enough? guess x)
  (< (abs (- (square guess) x)) 0.001)
)

; Just a square function
(define (square x)
  (* x x)
)

; Finally declare the dacade of our mess - square function accepting x argument
(define (sqrt x)
  (sqrt-iter 1.0 x)  
)

(sqrt 1) ; => 1.0
(sqrt 2) ; => 1.4142156862745097
(sqrt 4) ; => 2.0000000929222947

Excercises

There are a lot of them. Some are just simple questions and some are tricky. For example this question was really a brain halt for me
for a few minutes straight. Look.

Alyssa P. Hacker doesn’t see why if needs to be provided as a special form. “Why can’t I just define it as an ordinary procedure in terms of cond?” she asks. Alyssa’s friend Eva Lu Ator claims this can indeed be done, and she defines a new version of if:

1
2
3
(define (new-if predicate then-clause else-clause)
  (cond (predicate then-clause)
    (else else-clause)))

Question is - what will happen if our code uses this new-if? Well probably nothing - it works great I though. No. If this is a question there must be something wrong and it is! new-if is not a special construct and so arguments passed to it are evaluated. That's the problem.

1
2
3
4
5
(define (sqrt-iter guess x)
  (new-if (good-enough? guess x)
    guess
    (sqrt-iter (improve guess x)
                x)))

In this case sqrt-iter part is launched too causing infinite loop.

Conclusion - interesting read. I have actually watched the first 2 parts some day long time ago, didn't remember anything about it except it seemed
like some kind of magic.

Project Euler 15

Project euler #15 at first seemed tricky. To generate all the paths available would be really bruteforce, unoptimized and ugly and not how these tasks are done. I grabbed a pencil
and drew a 2x2 box and 3x3 box. Inside 2x2 box I followed each border and the total path length was 4. In 3x3 the total length is 6. How do these numbers relate I though. Well simple - width+height. It can be easily imagined if I follow to the very right, and then to the very bottom. It doesn't make a difference if I make a turn somewhere in between.

Lets assume that we have size which describes the box size, because both height and width are the same. So 2*size gives us a total length needed to get the the finish. In 2 case its 2*2=4 in 3 case it's 2*3=6 etc..

The number of steps needed to not show only the steps, but thinking about it - it shows combinations. All the possible combinations somehow. Factorial to the rescue!

In our case - 20x20 box, we can go 20 times to the right and 20 times to the right. This adds up to a total of 40steps required to get to the finish. A total of 40! combinations. Initialy I though thats all. 40! is my answer but project euler website proved me wrong. My thinking was corrected when I redrew the box on the paper and saw that I MUST take 20 steps to the right and 20 steps to the bottom. I can not take 30 steps to the right and 10 to the bottom. So my available combinations size must be reduced.

Combinatorics provide us with a formula C(n, r) where n is the total of combinations to choose from and r is - how many to choose. n=40, r=20. Why r is 20. Well because it provides correct solution in our case and because OUT OF THOSE 40 RIGHTS + BOTTOMS we need to choose just 20 BOTTOMS or RIGHTS. Doesnt matter which one.

So the formula for solution : 40! / ((20!) * (40-20)!)

Out of the fun writing Javascript, this is the solution. A very big one indeed!

1
2
3
4
5
6
7
function fact(n) {
  var res = 1;
  for (var i = 1;i <= n; i++) res *= i;   
  return res;
}

console.log(fact(40) / (fact(20) * fact(20) ));