Implement "Tail -F" in C++

How would you implement tail efficiently?

I don't think there are solutions different than "keep the latest N lines while reading forward the data" or "start from the end and go backwards until you read the Nth line".

The point is that you'd use one or the another based on the context.

The "go to the end and go backwards" is better when tail accesses a random access file, or when the data is small enough to be put on memory.
In this case the runtime is minimized, since you scan the data that has to be outputted (so, it's "optimal")

Your solution (keep the N latest lines) is better when tail is fed with a pipeline or when the data is huge.
In this case, the other solution wastes too much memory, so it is not practical and, in the case the source is slower than tail (which is probable) scanning all the file doesn't matter that much.

How to add and use tail in C list

Corrected code

struct list
{
int data;
struct list* next; <<<<==== you had lista here - surpised it even compiled
};
struct listWsk
{
struct list* head, * tail;
};
int create(struct listWsk* list, int data)
{
// you dont need 2 nodes here just one
// head and tail both point to it
struct list* new_front = malloc(sizeof(struct list));
if (NULL != new_front)
{
new_front->data = data;
new_front->next = NULL;
list->head = new_front;
list->tail = new_front;
return 1;
}
return 0; <<< === you were not returning a value here
};
void print(struct listWsk list)
{
while (list.head != NULL)
{
printf("%d ", list.head->data);
list.head = list.head->next;
}
}
void front(struct listWsk* list, int data)
{
struct list* new = malloc(sizeof(struct list));
if (new != NULL)
{
new->data = data;
new->next = list->head;
list->head = new;
}
};
void back(struct listWsk* list, int data)
{
struct list* new = malloc(sizeof(struct list));
if (new != NULL)
{
new->data = data;
new->next = NULL;
list->tail->next = new;
list->tail = new;

}

}

int main()
{
struct listWsk lista = { NULL,NULL };
create(&lista, 5);
front(&lista, 8);
back(&lista, 3);
// I added a few more test cases
front(&lista, 81);
back(&lista, 32);
print(lista);

}

you problem was that you had 2 nodes instead of one in you list with 1 entry, things just got confused

wrong output with my own implementation of the linux tail function

It seems you are reading bytes bytes into tail and passing tail to %s.

This will lead to undefined behavior because %s requires a pointer to a string (null-terminated sequence of characters) while tail won't contain any terminating null-character, and therefore it will read out-of-bounds, seeking for terminating null-character.

To overcome this issue, you can specify length to print.

Try printf("%.*s\n", bytes, tail); instead of printf("%s\n", tail);.

Tail command- using fseek() and getline() - Output out of order

You can either:

  • Keep the last N lines in a buffer and go forward, OR

  • Start from end of the file and go backward N lines.

The second approach is more efficient considering arbitrary file sizes. So your approach is fine and the more efficient one. However, your code needs work.

In your code, you may want to check for the case where there are less than k lines in the file. Take a look at the tail implementation in GNU here for more.

1) Memory leak: The memory allocated with getline needs to be freed at the end:

if(line)
free(line)

2) When you hit a '\n' (your if becomes true) you should still go backward first (think about what happens with the last '\n' in the file, that's the first '\n' that you hope to hit). However, your code performs a getline which is going forward.

3) Your else does not seek a single position, it depends on the value of previous strlen(line) and not the current line, and you might miss some '\n' by this seek OR your seek might be illegal --> make sure that you don't hit the pass the start of the file while going backward.

What are some good ways of implementing tail call elimination?

Simpler than writing a compiler and VM is to registerize and trampoline your interpreter. Since you have an interpreter and not a compiler (I assume), you only need a couple straightforward transformations to get proper support for tail calls.

You'll have to first write everything in continuation-passing style, which may be weird to think about and do in C/C++. Dan Friedman's ParentheC tutorial steps you through transforming a high-level, recursive program into a form that is machine-translatable to C.

In the end, you'll essentially implement a simple VM where instead of using regular function calls to do eval, applyProc, etc., you pass arguments by setting global variables and then do a goto to the next argument (or use a top-level loop and program counter)...

return applyProc(rator, rand)

becomes

reg_rator = rator
reg_rand = rand
reg_pc = applyProc
return

That is, all of your functions that normally call each other recursively are reduced to a pseudo-assembly in which they are just blocks of code that don't recur. An top-level loop controls the program:

for(;;) {
switch(reg_pc) {
case EVAL:
eval();
break;
case APPLY_PROC:
applyProc();
break;
...
}
}

Edit: I went through the same process for my hobby Scheme interpreter, written in JavaScript. I took advantage of a lot of anonymous procedures, but it might help as a concrete reference. Look at FoxScheme's commit history starting from 2011-03-13 (30707a0432563ce1632a) up through 2011-03-15 (5dd3b521dac582507086).

Edit^2: Non-tail recursion will still consume memory, even if it's not in the stack.

Is there a data structure for implementing a function equivalent to 'tail -n' command in C++?

One of the things that is slowing down your program is reading the file twice, so you could keep the last n EOL positions (n=10 in your program) and the most convenient data structure is a circular buffer but this isn't provided by the standard library as far as I know (boost has one). It can be implemented by an std::vector with size n, with an index where a modulo of n is done after incrementing.

With that circular buffer, you can jump immediately to the lowest offset (next one if buffer is full) in the file and print the needed lines.



Related Topics



Leave a reply



Submit