|
What is a linked list?
A linked list is a structure containing a list of objects. One object is denoted as the first. Each object then links to the next one in the list.
There are two main types of linked list: single and double. In a single linked list, each object links to the next. In a double linked list, each object links to both the next and the previous object.
Linked lists may be open or closed. If they are closed, the last object links back round to the first, and vice versa for a double linked list.
Wow! Linked lists are so great! I don't ever need to use an array again!
Wrong. The choice of whether to use a linked list or an array depends on what you're using it for. Let's compare the two:
|
TOPIC
|
Linked
Lists
|
Arrays
|
Reason
|
|
Speed
|
“Slow”
|
“Fast”
|
Searching for something in an array is
nothing more than going through a for loop, each time adding one to
a variable. In a linked list you have to dereference a pointer each time, which is a
more expensive operation.
|
|
Going to a specified location
|
“Slow”
|
“Very Fast”
|
If you want to go to e.g. the fifth
element in your list, you only need one operation on an array, but five
on a linked list. So if you know exactly where you have to be, an array is a
lot faster than a linked list.
|
|
Storing capacity
|
“Unlimited”
(theoretically)
|
“Limited”
|
When using an array you always need
to know the maximum number of elements you will have. With a linked list you
just add a new element when you need one.
|
|
Memory
|
“As much as needed”
|
“As much as you give”
|
Linked lists only use as much memory as
needed. If you only have one element, they will only use as much
memory as needed for
one element. When you have 1000 elements there will be only as much
memory as needed for 1000 elements. If you use an array you will
always use the same amount of memory; having one or 1000 elements makes
no difference to the size of the array.
|
|
Easy to use?
|
The more you use them the better it
will go.
|
Very easy
|
This differs from person to person.
|
OK, tell me how they work already!!!
First, some background on pointers. Even if you've used pointers before, you may find this section useful as a reminder or as clarification.
Think of a pointer as a code (an 'address') that tells the computer where the actual object is; it is not the object itself. For simplicity, we shall consider a pointer to an int first.
int variable = 5;
int *pointer;
We indicate that 'pointer' is a pointer instead of a normal variable by using the asterisk (*). At the moment, 'pointer' is a hanging pointer - it does not point to anything. Before we can use this pointer, we must assign it.
pointer = &variable;
It now points to 'variable'. The ampersand (&) before 'variable' means 'address of variable'. This address is assigned to 'pointer', so now 'pointer' points to the variable.
printf("The pointer points to the number %d.\n", *pointer);
The asterisk (*) dereferences the pointer - it gets the number to which 'pointer' points. This statement displays it on the screen. Make sure you understand the difference between this asterisk and the one used when declaring the variable.
(*pointer)++;
Be careful about this. The ++ and -- operators have higher precedence than the dereferencing operator, so without the brackets, this would change the value of 'pointer'. You can only do this safely with pointers when they point to arrays or character strings. However, with brackets as it is written, it will change the value of 'variable'.
printf("The variable contains the number %d.\n", variable);
This will display "The variable contains the number 6."
Now we'll consider a pointer to a struct. We'll use the RGB struct from Allegro.
RGB rgb;
rgb.r=63;
rgb.g=32;
rgb.b=16;
The dot (.) is used to refer to elements in the struct. But what if we are using a pointer to the struct?
RGB *rgb_pointer = &rgb;
(*rgb_pointer).g = 48;
First we must dereference the struct. Then we can refer to a specific element. However, the notation is a little tedious, so we use the following shortcut:
rgb_pointer->g = 48;
The . and -> operators have higher precedence than * and &. Therefore:
&rgb.r is a pointer to the 'r' field.
(&rgb)->r (which is equivalent to rgb.r) cannot be written as &rgb->r.
Wow, I know how pointers work! Now I can go and finish that other game...
Yes, get some practice with pointers. When you're ready, come back here and read on. Now I'll explain how to create a linked list.
The first thing you do is create a struct. This can contain anything you like. In the spirit of this edition of Pixelate, we shall use an example that may be useful in a platform game.
typedef struct ENEMY {
struct ENEMY *next;
int x,y;
int direction;
BITMAP *sprite;
} ENEMY;
First let's make sure we understand the syntax. First we create a struct called 'struct ENEMY'. Then we place this definition in a 'typedef' statement, which will enable us to refer to the struct later simply by 'ENEMY'.
The key to the linked list is the 'next' field. This is a pointer to another object of the same type. This pointer will be used to 'link' to the next object. If this is the last object, we set the pointer to NULL.
Notice that the typedef does not apply within the struct itself; the 'struct' keyword is necessary.
Now we must create a pointer to the first object. That's right, a pointer - the linked list could be empty.
ENEMY *first_enemy = NULL;
Notice that we are starting with an empty list.
Adding an object to the list
The following function will create an enemy.
ENEMY *add_enemy() {
ENEMY *enemy = malloc(sizeof(ENEMY));
enemy->next = first_enemy;
first_enemy = enemy;
return enemy;
}
The malloc() function allocates some memory and returns a pointer to it. sizeof(ENEMY) indicates how much memory the enemy needs. In order to use malloc(), you will have to #include <stdio.h>.
Then we set the enemy's 'next' field to point to what was the first enemy in the list (if there was one), and this enemy becomes the new first enemy. Finally the function returns a pointer to the enemy so that you can fill in the other fields. Here is an example of how to use the function:
ENEMY *enemy = add_enemy();
enemy->x = random() % 320;
enemy->y = random() % 200;
enemy->direction = random() & 1;
You may ask, why do we need to return a pointer to the enemy when we can just use first_enemy? This is a question of modular design; we shall come back to this later.
Scanning through the objects
You will probably want to do something with the objects. Perhaps you want to change their positions; perhaps you want to draw them. You can use the following algorithm to scan through the list:
ENEMY *enemy = first_enemy;
while (enemy) {
do_what_you_will_with_the(enemy);
enemy = enemy->next;
}
This will start with first_object. Provided it is not NULL (so the list is not empty), it will execute the loop once on that enemy. Then it will proceed on to the next one. Provided this next one exists, it will execute the loop again.
Removing an object from the list
When scanning through the objects, you may want to remove one or two along the way. If so, you will have to use a slightly different version of the loop:
ENEMY **enemy_p = &first_enemy;
while (*enemy_p) {
ENEMY *enemy = *enemy_p;
if (process_enemy_and_return_1_if_we_want_to_destroy_the(enemy)) {
*enemy_p = enemy->next;
free(enemy);
} else {
enemy_p = &enemy->next;
}
}
This is complicated, so pay attention. The first thing we do is declare a pointer to [a pointer to an enemy]. That's not a misprint; this is really a pointer to a pointer. We call it enemy_p. Make sure you understand the nature of this variable. The reason for this should become clearer as we go on.
First we set enemy_p to point to the first enemy's pointer. When we test *enemy_p, we are really testing the value of first_enemy - so we just check that the list is not empty before proceeding.
Then we create a new variable, 'enemy'. This one simply points to the enemy. However, notice the difference. If we write 'enemy = NULL', then first_enemy does not change. However, if we write '*enemy_p = NULL', then first_enemy does change. This is the reason for using the pointer to a pointer.
The function will return 1 if we want to destroy the enemy. Let's have a closer look at what happens next.
*enemy_p = enemy->next;
free(enemy);
Here we are effectively changing the value of first_enemy. We set it to point to the next one. Then we free the memory previously used by the enemy. Gone. It's as simple as that. But what happens if we do not want to get rid of the enemy?
enemy_p = &enemy->next;
We are not changing first_enemy here. We are changing enemy_p. It will no longer point to first_enemy. Now it will point to [the pointer to the next enemy in the list], which is first_enemy->next. Next time round the loop, if we want to get rid of the second object, we will be changing first_enemy->next.
This continues until we reach the end of the list.
If you have trouble understanding this method, then just use it blindly. Provided you stick to the algorithm as quoted above, and write the function properly, you should be safe.
WARNING: You cannot add objects using this second version of the algorithm. If you need to add and remove objects in the same loop, then use the first algorithm and add a 'die' field to the struct. Set an enemy's 'die' value to 1 if you want to get rid of the poor thing. You can then use the second algorithm to scan the objects and remove those whose 'die' values are set.
Destroying all the objects at once
At the end of your program you will want to get rid of all the objects at once. If you don't, then go and work for Microsoft.
Use the following algorithm:
ENEMY *enemy = first_enemy;
first_enemy = NULL;
while (enemy) {
ENEMY *next = enemy->next;
free(enemy);
enemy = next;
}
Deciphering this algorithm is left as an exercise for the reader. However there is one thing to note. Think about what would happen if we did this:
free(enemy);
enemy = enemy->next;
This is WRONG, as enemy->next is stored in the struct. Therefore we must read it before we free the memory. We can do this by using an intermediate variable 'next' to store the address of the next element.
Is that it? I could have figured that out by myself, you know...
Good. Now get to work!
But wait, there's more, if you want it.
Sorting a linked list
Whaa? OK. Suppose you want to sort your objects from left to right, using their x-values. Don't ask me why, but that's what you want to do. First it would help if you added new enemies in the right place.
Here's a modified version of the add_enemy() function:
ENEMY *add_enemy(int x) {
ENEMY *new_enemy = malloc(sizeof(ENEMY));
ENEMY **enemy_p = &first_enemy;
while (*enemy_p && (*enemy_p)->x < x) enemy_p = &(*enemy_p)->next;
new_enemy->next = *enemy_p;
*enemy_p = new_enemy;
new_enemy->x = x;
return new_enemy;
}
Look carefully at this. We cycle through the elements until we find one whose x position is less than that of the new enemy. Again we are using a pointer to a pointer, so that we can insert our object when the time comes.
Notice that we are no longer necessarily returning first_enemy. That careful decision earlier paid off. However, we have added a parameter. OK, so we didn't think of everything when we designed the interface; but adding the parameter to all the calls is easy enough.
Now what if the objects move? We will need to sort all the enemies already in the linked list.
void sort_enemies() {
ENEMY *old_enemy = first_enemy;
first_enemy = NULL;
while (old_enemy) {
ENEMY **enemy_p = &first_enemy;
while (*enemy_p && (*enemy_p)->x < old_enemy->x) enemy_p = &(*enemy_p)->next;
{
ENEMY *next = old_enemy->next;
old_enemy->next = *enemy_p;
*enemy_p = old_enemy;
old_enemy = next;
}
}
}
First this moves the list into a temporary variable called old_enemy, and clears the original list. The list has not really moved, but we are storing its pointer in a different place.
Then we move the objects one by one into the original list. The code to do this is borrowed from add_enemy(). The difference is that we also advance old_enemy to point to the next object.
This may not be the most efficient sorting algorithm, but the usual methods are difficult to implement on linked lists. This is because it is impossible to address individual elements. Here you should consider whether your program will work better with an array, as the qsort() function would probably be quicker than this one.
Other Types of Linked List
If you need to traverse a list backwards as well as forwards, a double linked list might be useful. You would then create the struct as follows:
typedef struct ENEMY {
struct ENEMY *prev;
struct ENEMY *next;
int x,y;
int direction;
BITMAP *sprite;
} ENEMY;
To create a closed linked list, you would have to alter the algorithms for creating and scanning them. In particular you would have to make sure you stopped after one journey round the list.
I shall not explain how to use such lists here. For example code, check the Allegro source code, specifically src/poly3d.c. When drawing 3D polygons, a closed double linked list is used for the edges of the polygon. The routine then finds the highest point on the polygon and traverses it in both directions for the two sides of the polygon. It stops when the two pointers meet up again at the lowest point.
No one likes a Hardcoder
Our linked list works quite well now. We can add stuff, remove stuff, and even free them
if we need to. However, our "stuff" is currently limited to one thing.
This is all well and fine in many cases, but for the sake of this article, we will
try something more complex. As you all know, making specific implementations can be
generally bad when you could create a more general implementation instead.
What we're aiming at is creating some kind of linked list that can store
any kind of data. "How do we do this?" you might ask. Well, then, I have two simple
words for you: void pointer.
A void pointer is, for those of you who didn't know, a pointer that can point to anything -
be it an integer, an array or even a struct. We can put this to some good use in
creating our universal linked list engine!
typedef struct _node
{
void *data;
struct _node *next;
} node;
This is the basic struct we will use for our linked list. As you can see, it has all
the necessary stuff for making a linked list, as well as a void pointer we can use to
point to the data we just allocated. When you want to add something, allocate some space
for a node, set *data to point to what you want it to point to, and add it to our list.
Since our linked list now points to node structs rather than enemies, we need to change
things around a bit.
Here are our new functions. Note that the algorithms are fundamentally different from the above. In particular, we use comparisons with 'first' as an alternative to using the pointer to a pointer. Use whichever method you prefer.
void add_node(node *add)
{
if (!add)
return;
add->next = first;
first = add;
}
void remove_node(node *del)
{
node *tmp;
if (!del)
return;
if (del == first)
first = del->next;
else
{
for (tmp=first;tmp && tmp->next != del;tmp=tmp->next);
if (tmp)
tmp->next = del->next;
}
}
About now, some of you may be screaming out in pain, thinking, "Hey, wait a minute - if our
system is so universal, how come we're still hanging on to a globally defined *first pointer?"
That's a very good question, and as a matter of fact, we won't for long. If your program
only ever uses one linked list, this won't be a problem. But say you want two different lists
for different purposes - having them share one *first node is not a good thing.
So, what we want is to be able to define our own pointer and have the functions behave
accordingly with this arbitrary first pointer.
node *add_node(node *first, node *add)
{
if (!add)
return first;
add->next = first;
return add;
}
node *remove_node(node *first, node *del)
{
node *tmp;
if (!del || !first)
return first;
if (del == first)
first = del->next;
else
{
for (tmp=first;tmp && tmp->next != del;tmp=tmp->next);
if (tmp)
tmp->next = del->next;
}
return first;
}
Things have changed! The change the easiest to understand must be that we now pass
a first pointer to our functions, allowing them to work properly with different first
elements. Our functions now use this pointer instead of the globally defined one.
The other change is that our functions now return a node pointer.
The reason for this is that our *first pointer may change inside one of these,
for example if we add an element to the front of our list, or if we remove the first one.
Just changing it won't do any good - the *first pointer passed to our
functions lies on the stack, and is basically just a copy of the original *first
pointer. So, we need a way to change the actual pointer to the first element.
Thus, all these functions return a new *first pointer, if necessary. So, to use
these new functions, we could do something like this:
node *first_one;
node *first_two;
void foobar()
{
node *tmp_one, *tmp_two;
first_one = add_node(first_one, tmp_one); // will return tmp_one
first_two = add_node(first_two, tmp_two); // will return tmp_two
first_one = remove_node(first_one, tmp_one); // will return NULL
first_one = remove_node(first_one, tmp_two); /* will do nothing, since tmp_two is not in this
list, and return first_one (which is NULL) */
}
This demonstrates several things. As you can see, we have two individual lists, each
with its own *first pointer that points to the beginning of the list. We can add and
remove nodes from each of them individually, keeping two separate lists for different
purposes. Since our functions return the new *first value for our list, we need to set
the *first value to whatever we get returned from the function, as shown above.
All right! We now have a working universal linked list that can provide us with any
data needed! Rock on!
There are several things that could be improved here, for example you might want to
create a function that allocates space AND adds the node to the list in one go, one
that frees and removes nodes, and perhaps even one that clears and frees the entire
list? It's up to you - with a bit of brain activity you can no doubt figure out a
way to do it. But we're not going to cover it here.
One thing you have to remember when working with linked lists is that data are always
allocated at runtime, and MUST be freed, or you WILL end up with memory leaks! So
keep a close eye on those pointers!
Kicking it up a notch
Many people out there use C++ rather than old musky C, and some of the things we've
learned today become obsolete when using C++. Owing to the object-oriented nature of
C++, we can do some pretty nifty things. Also, having constructors and destructors
to do all the hard stuff for us, our linked list becomes a pleasure to work with,
since you no longer have to modify everything manually!
Consider this piece of simple code:
class Node
{
private:
void Add(Node *); // Adds this node to the list starting at argument
void Remove(); // Removes this node from the list
void SetFirst(Node *);
// Sets the First element of this and all succeeding nodes to argument, so
// make sure to call it from the real first element of the list!
public:
void *Data;
Node *First; // First node in the list
Node *Next; // Next node in the list
// Constructor, with default First node being NULL, as in create new list
Node(void *, Node * = NULL);
// Destructor - removes this node from the list
~Node();
void Clean(); // Destroys the entire list
void Nuke(); // Destroys list and frees data
};
Node::Node(void *p, Node *n)
{
Data = p;
Add(n); // Add this to the list
}
Node::~Node()
{
Remove(); // Remove from the list
}
void Node::Add(Node *n)
{
if (n)
Next = n->First; // Set the next element to what was first
else
Next = NULL; // No starting node - set to NULL.
SetFirst(this); // Make sure all nodes know that this is the first element now
}
void Node::Remove()
{
Node *tmp;
if (First == this) // If this is the first node
SetFirst(Next); // just set the first node to the next one instead
else
{
for(tmp=First;tmp && tmp->Next != this;tmp=tmp->Next); // Find previous node
if (tmp) // If we actually found it, which we should have
{
tmp->Next = Next; // Set the previous one's next to this one's next, to close the loop
}
}
First = NULL; // To be sure, reset First and Next.
Next = NULL;
}
void Node::SetFirst(Node *n)
{
First = n; // Set this node's First to n
if (Next) // If there's another node...
Next->SetFirst(n); // Set that one's First to n as well
}
void Node::Clean()
{
Node *tmp, *r = First; // Start at the first node of this list, for better speed
while (r) // While r still means something
{
tmp = r->Next; // Save our next pointer
delete r; // Remove the one we're at
r = tmp; // And advance.
}
}
void Node::Nuke()
{
Node *r = First;
/* Basically does the same as Clean, except it frees up the data and then calls Clean to do the freeing of the nodes themselves. */
while (r)
{
delete r->Data;
r = r->Next;
}
Clean();
}
That's quite a bit of code, isn't it? Because of that, I'm not going to go into a lot of
detail about it - look at it and try it out yourself.
Basically, it's an improved C++ version of our previous linked list, based around the Node class
we just created. This Node class contains all the functions we need to add and remove nodes
from the list, as well as one to clear out the entire list. Now, we don't need to do all
the stuff we did before - much of it is done from within the class itself. Also, we now
store the First pointer inside the class, so that all nodes know what node is first in the
list. This saves us a lot of trouble passing pointers back and forth, and lets us retrieve
the first element in the list from a pointer to any node in the list, allowing us to be
less strict on the "always keep the first pointer" bit.
To use this list, first create a starter node, by doing:
Node *f;
f = new Node(data);
This will create a single element with a First value pointing to itself and a NULL Next value
(the second argument is defaulted to NULL, so if nothing is passed, it will assume NULL),
meaning a single, lonely node, with the Data pointer pointing to whatever you passed as the
first argument.
Remember to keep the value returned - you will always need at least one pointer to a node
in the list, or else we're back to memory leaks again. Just adding more and more nodes like
this will create tons of individual lists with only one node each.
When we later want to add more nodes to our list, we do this:
new Node(data, f);
And it will be added to our list, ready to use. The list will automatically update itself
with the proper First values, thanks to our SetFirst(Node*) function. Beware though -
whatever f was previously pointing to is, because of prepending, no longer the actual first
node in the list! This isn't a problem though - when adding to a list, the node will be
added to the real beginning of the list regardless of what node you pass as starting node.
When you're done adding new nodes, simply set f = f->First, and it will be all right again -
another pro of storing the First value!
Later on, you can delete any node by simply doing:
delete your_node;
And it will be removed and freed, without you having to do any manual unlinking of the node,
thanks to the destructor.
Finally, there's the Clean() function, which lets you delete the entire list. This one
can also be called from any node, and will remove and free the entire list. Note that it
does not free up what Data is pointing to, since that is not always desired - hence the
Nuke() function that deletes the entire list and frees up all the data in it. Because
we start from the first element and always delete the first element of the list,
the Remove() function doesn't have to go through the entire list over and over again
to find its predecessor, as it would have to do if we went the other way around,
starting at the end of the list. This could perhaps need a bit of improving on though -
Remove() still goes through the list updating the First pointer, which isn't really
necessary since we won't use it anymore anyway!
(Note: This class could be expanded and improved a ton, for example by allowing you
to add a node anywhere in the list, sorting it, searching through it, using iteration
pointers, letting you traverse it more easily using overloaded operators, et cetera -
but that isn't the issue of this article, and this one is more like the one we did
before in regular C.)
Conclusion
Whew, that
was quite a lot about linked lists. We hope you learned something. If you
have any questions then don't hesitate to drop any one of us a line.
Ben Davis
X-G
Dy_Anne
|