Simple Glob in C++ on Unix System

Simple glob in C++ on unix system?

I have that in my gist. I created an STL wrapper around glob that returns a vector of strings and takes care of freeing the glob result. It is not especially efficient, but the code is a little more readable and some would say easier to use.

#include <glob.h> // glob(), globfree()
#include <string.h> // memset()
#include <vector>
#include <stdexcept>
#include <string>
#include <sstream>

std::vector<std::string> glob(const std::string& pattern) {
    using namespace std;

    // glob struct resides on the stack
    glob_t glob_result;
    memset(&glob_result, 0, sizeof(glob_result));

    // do the glob operation; ::glob with four arguments is the C library function
    int return_value = ::glob(pattern.c_str(), GLOB_TILDE, NULL, &glob_result);
    if (return_value != 0) {
        // note: GLOB_NOMATCH (no file matched) is nonzero too, so it also throws
        globfree(&glob_result);
        stringstream ss;
        ss << "glob() failed with return_value " << return_value;
        throw std::runtime_error(ss.str());
    }

    // collect all the filenames into a std::vector<std::string>
    vector<string> filenames;
    for (size_t i = 0; i < glob_result.gl_pathc; ++i) {
        filenames.push_back(string(glob_result.gl_pathv[i]));
    }

    // cleanup
    globfree(&glob_result);

    // done
    return filenames;
}

List files in directories using Glob() in C

Use nftw() instead of glob() if you want to examine entire trees, rather than one specific path and filename pattern.

(It is absolutely silly to reinvent the wheel by going at it using opendir()/readdir()/closedir(), especially because nftw() should handle filesystem changes gracefully, whereas self-spun tree walking code usually ignores all the hard stuff, and only works in optimal conditions on your own machine, failing in spectacular and wonderful ways elsewhere.)

In the callback function you pass to nftw(), use fnmatch() to decide whether the file name is acceptable using glob patterns.

If you wish to filter using regular expressions instead, use regcomp() to compile the pattern(s) before calling nftw(), then regexec() in your filter function. (Regular expressions are more powerful than glob patterns, and they are compiled to a tight state machine, so they are quite efficient, too.)
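To illustrate the compile-once, match-many idea, here is a minimal sketch of a small C++ wrapper around regcomp(), regexec() and regfree(). The class name CompiledPattern and the choice of REG_EXTENDED | REG_NOSUB are my own assumptions, not something the functions require.

```cpp
#include <regex.h>

#include <stdexcept>
#include <string>

// Hypothetical helper: compiles a POSIX extended regular expression once,
// so that it can be tested against many file names afterwards.
class CompiledPattern {
public:
    explicit CompiledPattern(const std::string& pattern) {
        // REG_NOSUB: we only need a yes/no answer, not match offsets.
        if (regcomp(&re_, pattern.c_str(), REG_EXTENDED | REG_NOSUB) != 0) {
            throw std::invalid_argument("regcomp() rejected the pattern");
        }
    }
    ~CompiledPattern() { regfree(&re_); }

    // The compiled regex_t owns memory, so copying is disabled.
    CompiledPattern(const CompiledPattern&) = delete;
    CompiledPattern& operator=(const CompiledPattern&) = delete;

    // True if the pattern matches somewhere in name.
    bool matches(const char* name) const {
        return regexec(&re_, name, 0, nullptr, 0) == 0;
    }

private:
    regex_t re_;
};
```

You would construct one CompiledPattern before starting the walk, for example with "\\.(txt|md)$", and call matches() on each name from inside the callback.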

If you are unsure about the difference, the Wikipedia articles on glob patterns and regular expressions are very useful and informative.

All of the above are defined in POSIX.1-2008, so they are portable across all POSIX-y operating systems.
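As a rough sketch of the nftw() plus fnmatch() combination described above: the names find_files, visit, g_pattern and g_matches are made up for this example, and the descriptor limit of 16 is an arbitrary choice. Since nftw() passes no user-data pointer to its callback, the sketch keeps its state in file-scope variables, which makes it unsuitable for concurrent use.

```cpp
// Note: a plain C program usually needs _XOPEN_SOURCE defined as 500 or higher
// before the includes to see nftw(); g++ on glibc normally exposes it already.
#include <fnmatch.h>
#include <ftw.h>
#include <sys/stat.h>

#include <string>
#include <vector>

// State shared with the callback (hypothetical names, not thread-safe).
static const char* g_pattern = nullptr;
static std::vector<std::string>* g_matches = nullptr;

// Called by nftw() once for every entry in the tree.
static int visit(const char* path, const struct stat* /*sb*/, int typeflag,
                 struct FTW* ftwbuf) {
    // ftwbuf->base is the offset of the file name within path.
    const char* name = path + ftwbuf->base;
    if (typeflag == FTW_F && fnmatch(g_pattern, name, 0) == 0) {
        g_matches->push_back(path);
    }
    return 0;  // zero tells nftw() to keep walking
}

// Collects regular files under root whose name matches the glob pattern.
// Returns false if nftw() reports an error (for example, root does not exist).
bool find_files(const std::string& root, const std::string& pattern,
                std::vector<std::string>& out) {
    g_pattern = pattern.c_str();
    g_matches = &out;
    const int max_open_fds = 16;  // arbitrary limit on descriptors nftw() may hold
    // FTW_PHYS: do not follow symbolic links.
    return nftw(root.c_str(), visit, max_open_fds, FTW_PHYS) == 0;
}
```

A call such as find_files(".", "*.txt", results) would then fill results with every regular file below the current directory whose name ends in .txt.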

glob pattern matching in .NET

I found the actual code for you:

Regex.Escape( wildcardExpression ).Replace( @"\*", ".*" ).Replace( @"\?", "." );
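The same conversion can be sketched outside .NET. Below is a rough C++ equivalent using std::regex. The function names wildcard_to_regex and wildcard_match are hypothetical, and unlike the one-liner above this version adds ^ and $ anchors so that the whole string has to match. That anchoring is my assumption about what is usually wanted.

```cpp
#include <regex>
#include <string>

// Translates a wildcard expression (* and ?) into an anchored regular expression.
std::string wildcard_to_regex(const std::string& wildcard) {
    // Characters that must be escaped to be taken literally in a regex.
    static const std::string meta = R"(\^$.|+()[]{})";
    std::string out = "^";
    for (char c : wildcard) {
        if (c == '*') {
            out += ".*";  // any run of characters
        } else if (c == '?') {
            out += '.';   // exactly one character
        } else {
            if (meta.find(c) != std::string::npos) {
                out += '\\';
            }
            out += c;
        }
    }
    out += '$';
    return out;
}

// Convenience wrapper: true if text matches the wildcard expression as a whole.
bool wildcard_match(const std::string& wildcard, const std::string& text) {
    return std::regex_search(text, std::regex(wildcard_to_regex(wildcard)));
}
```

Note that this treats path separators like any other character, which is also what the Regex.Escape approach does. Real glob matching of paths has more rules than this.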

ls command runnable in terminal not runnable in C++

Wildcards like * are evaluated by the shell, so you'll have to invoke the shell directly if you want it to process something for you.

For example, calling /bin/sh -c "ls /home/aidan/Pictures/Wallpapers/*/*.{jpg,JPG,png,PNG}" instead of ls /home/aidan/Pictures/Wallpapers/*/*.{jpg,JPG,png,PNG} will work, provided /bin/sh supports brace expansion (bash does; a strictly POSIX sh such as dash does not). There is also a standard library function called system() that runs a given command through the shell for you.

However, using the shell to do the globbing is very dangerous if you pass untrusted user input to the shell. So try listing all the files and then using a native globbing solution to filter them instead of shell expansions.
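One way to avoid the shell entirely is to call glob() yourself. Since POSIX glob() has no brace expansion, the sketch below simply runs one pattern per extension and merges the results. The function name expand_patterns is made up for illustration, and treating "no match" as an empty result rather than an error is my own choice.

```cpp
#include <glob.h>

#include <stdexcept>
#include <string>
#include <vector>

// Expands each glob pattern in-process (no shell involved) and returns all matches.
std::vector<std::string> expand_patterns(const std::vector<std::string>& patterns) {
    std::vector<std::string> paths;
    for (const std::string& pattern : patterns) {
        glob_t result = {};
        const int rc = glob(pattern.c_str(), 0, nullptr, &result);
        if (rc == 0) {
            for (size_t i = 0; i < result.gl_pathc; ++i) {
                paths.emplace_back(result.gl_pathv[i]);
            }
        }
        globfree(&result);
        // GLOB_NOMATCH just means this pattern matched nothing; other codes are errors.
        if (rc != 0 && rc != GLOB_NOMATCH) {
            throw std::runtime_error("glob() failed for pattern " + pattern);
        }
    }
    return paths;
}
```

For the wallpaper case you would pass four patterns, one each ending in *.jpg, *.JPG, *.png and *.PNG. Because the patterns never go through a shell, characters such as ; or $( ) in them have no special meaning.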

list files in a directory - and filter results

For filtering - or globbing - I assume I can manually do this in my code

Do not reinvent the wheel - glob and fnmatch and wordexp are standard.
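Of the three, wordexp() is probably the least familiar, so here is a minimal sketch of calling it from C++. The wrapper name expand_words is hypothetical. WRDE_NOCMD is passed so that command substitution is refused instead of executed, which seems the safer default to me.

```cpp
#include <wordexp.h>

#include <string>
#include <vector>

// Performs shell-style word expansion (tilde, variables, globs, field splitting)
// on words and appends the resulting fields to out. Returns false on failure.
bool expand_words(const std::string& words, std::vector<std::string>& out) {
    wordexp_t result = {};
    // WRDE_NOCMD: fail with WRDE_CMDSUB rather than run $(...) or `...`.
    const int rc = wordexp(words.c_str(), &result, WRDE_NOCMD);
    if (rc != 0) {
        // Only WRDE_NOSPACE may leave a partially allocated result behind.
        if (rc == WRDE_NOSPACE) {
            wordfree(&result);
        }
        return false;
    }
    for (size_t i = 0; i < result.we_wordc; ++i) {
        out.push_back(result.we_wordv[i]);
    }
    wordfree(&result);
    return true;
}
```

For plain pattern matching against names you already have, fnmatch() is the lighter tool. wordexp() is more useful when you want the full set of shell-like expansions on a user-supplied string.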

How can I list all "*.txt"

The *.txt glob is expanded by your shell as part of filename expansion, and the results are split into separate arguments before the ls command is executed. Expanding glob expressions is not part of ls. Supporting globbing in ls would be a non-standard extension.

To list files in a directory, use scandir (with alphasort). A good example of scandir is in the Linux man-pages entry for scandir:

#define _DEFAULT_SOURCE
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
    struct dirent **namelist;
    int n;

    n = scandir(".", &namelist, NULL, alphasort);
    if (n == -1) {
        perror("scandir");
        exit(EXIT_FAILURE);
    }

    while (n--) {
        printf("%s\n", namelist[n]->d_name);
        free(namelist[n]);
    }
    free(namelist);

    exit(EXIT_SUCCESS);
}

An ls program would iterate over its arguments and check each one. If it is a file, list it with permissions (one call to stat plus formatting). If it is a directory, use scandir, then stat for each entry plus formatting. Note that the output format of ls is actually specified in some cases (but still leaves much freedom to implementations).
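The third argument of scandir, which the man page example leaves as NULL, is a filter callback, so the "*.txt" selection can happen inside scandir itself. Here is a small sketch of that. The names keep_txt and list_txt are made up, and the pattern is hard-coded because the filter callback receives no extra arguments.

```cpp
#include <dirent.h>
#include <fnmatch.h>

#include <cstdlib>
#include <string>
#include <vector>

// Filter callback for scandir(): a nonzero return keeps the entry.
static int keep_txt(const struct dirent* entry) {
    return fnmatch("*.txt", entry->d_name, 0) == 0;
}

// Appends the sorted names of all *.txt entries in dir to out.
// Returns false if scandir() fails (for example, dir cannot be opened).
bool list_txt(const std::string& dir, std::vector<std::string>& out) {
    struct dirent** namelist = nullptr;
    const int n = scandir(dir.c_str(), &namelist, keep_txt, alphasort);
    if (n == -1) {
        return false;
    }
    for (int i = 0; i < n; ++i) {
        out.push_back(namelist[i]->d_name);
        std::free(namelist[i]);  // scandir() allocates each entry with malloc()
    }
    std::free(namelist);
    return true;
}
```

Without FNM_PERIOD, the pattern also matches hidden files such as .notes.txt, which differs from what the shell would expand *.txt to.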

How can I list all "*.txt" files in a directory in good old C?

There's another example in the man-pages entry for glob. The code is simple:

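/* needs <glob.h> and <stdio.h> */
glob_t globbuf = {0};
/* only walk the list on success: after a failure or no match, gl_pathv may be NULL */
if (glob("*.txt", 0, NULL, &globbuf) == 0) {
    for (char **file = globbuf.gl_pathv; *file != NULL; ++file) {
        printf("%s\n", *file);
    }
}
globfree(&globbuf);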

The POSIX links and the Linux man-pages project are great for studying. The man pages have a SEE ALSO section at the bottom that is great for discovering functions. Also see the GNU manual.


