Regex for Finding File Paths

Regex for finding file paths

Use the regex (\/.*?\.[\w:]+). The lazy quantifier .*? makes the match non-greedy, so each path ends at the first extension it reaches. To find multiple matches in the same line, use re.findall().

Update:
Using this code and the example provided, I get:

>>> import re
>>> re.findall(r'(\/.*?\.[\w:]+)', "file path /log/file.txt some lines /log/var/file2.txt")
['/log/file.txt', '/log/var/file2.txt']
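
To see why the lazy quantifier matters, the sketch below runs the greedy and non-greedy forms of the pattern against the same sample line. The variable names are illustrative and not part of the original answer.

```python
import re

line = "file path /log/file.txt some lines /log/var/file2.txt"

# Greedy: .* runs to the end of the line, then backtracks to the last ".ext"
greedy = re.findall(r'/.*\.[\w:]+', line)

# Lazy: .*? stops at the first ".ext" it can reach, so each path is separate
lazy = re.findall(r'/.*?\.[\w:]+', line)

print(greedy)
print(lazy)
```

The greedy form returns a single match that spans both paths and the text between them, while the lazy form returns the two paths separately.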

Regex for parsing directory and filename

Try this:

^(.+)\/([^\/]+)$

EDIT: escaped the forward slash to prevent problems when copy/pasting the Regex
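
As a rough illustration, here is how the pattern might be applied in Python. The helper name split_path is hypothetical, and the slash is left unescaped because Python's re module does not require escaping it.

```python
import re

# Greedy (.+) takes everything up to the last slash; ([^/]+) takes the rest
SPLIT_PATH = re.compile(r'^(.+)/([^/]+)$')

def split_path(path):
    """Return (directory, filename), or None if the pattern does not match."""
    m = SPLIT_PATH.match(path)
    return m.groups() if m else None

result = split_path("/log/var/file2.txt")
no_dir = split_path("file2.txt")  # no slash, so there is nothing to split

print(result)
print(no_dir)
```

Because (.+) needs at least one character before the last slash, a file directly under the root (such as /file.txt) does not match this pattern either.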

RegEx to find Windows file paths inside of text

You may use this regex to capture folder and filename in 2 separate capture groups:

(?:\\\\[^\\]+|[a-zA-Z]:)((?:\\[^\\]+)+\\)?([^<>:]*)

RegEx Details:

  • (?:\\\\[^\\]+|[a-zA-Z]:): Non-capturing group that matches either a server name or IP address (\\ followed by 1+ non-\ characters) OR a drive letter followed by a :
  • ((?:\\[^\\]+)+\\)?: 1st capture group for the folder path. It matches a \ followed by 1+ non-\ characters, repeated one or more times, and then a closing \. The trailing ? makes this group optional.
  • ([^<>:]*): 2nd capture group for the filename. It matches 0 or more characters other than <, > and :
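
The sketch below tries the pattern on two made-up paths, one with a drive letter and one with a UNC server name. The sample paths and the WIN_PATH name are assumptions for illustration only.

```python
import re

# The pattern from the answer, written as a Python raw string
WIN_PATH = re.compile(r'(?:\\\\[^\\]+|[a-zA-Z]:)((?:\\[^\\]+)+\\)?([^<>:]*)')

samples = [
    r"C:\Users\me\notes.txt",           # drive-letter path (hypothetical)
    r"\\server\share\docs\report.pdf",  # UNC path (hypothetical)
]

# Each entry is (folder, filename) from capture groups 1 and 2
parsed = [WIN_PATH.search(s).groups() for s in samples]

for folder, filename in parsed:
    print(folder, filename)
```

Note that the filename class only excludes <, > and :, so when a path is embedded in longer text the second group may keep matching past the end of the filename. The delimiters around the path in the real input decide how well this works.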

Regular expression to match a file path with certain prefix

You don't need groups or lookaheads; in MongoDB this is just a regex match. The query can be as simple as:

db.collection.find({fieldname:/^\/abcd\/[^\/]+$/})
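
To check which values such a pattern accepts, here is a small sketch that uses Python's re module as a stand-in for the database query. The sample paths are invented, and regex behaviour in MongoDB itself may differ in edge cases.

```python
import re

# Same pattern as in the query; "/" needs no escaping in Python
PREFIX = re.compile(r'^/abcd/[^/]+$')

paths = [
    "/abcd/file.txt",      # directly under /abcd/
    "/abcd/sub/file.txt",  # nested one level deeper
    "/xyz/file.txt",       # different prefix
    "/abcd/",              # prefix only, no name after it
]

matched = [p for p in paths if PREFIX.search(p)]
print(matched)
```

Only the entry directly under /abcd/ passes: [^\/]+ forbids any further slash, and it requires at least one character after the prefix.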

Regex to find directory in text

You can use a regex like this:

\/.*\.[\w:]+

Btw, if you want to allow backslashes in the path you can have:

[\\\/].*\.[\w:]+
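
A quick sketch of the backslash-tolerant variant in Python, with invented sample strings:

```python
import re

# Start at the first "/" or "\", then run greedily to the last ".ext"
ANY_SLASH = re.compile(r'[\\/].*\.[\w:]+')

unix_hit = ANY_SLASH.search("see /var/log/app.log").group(0)
win_hit = ANY_SLASH.search(r"see C:\logs\app.log").group(0)

print(unix_hit)
print(win_hit)
```

The match begins at the first slash or backslash, so the drive letter in the Windows-style sample is not part of the result. As the pattern is greedy, a line containing several paths yields one long match.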

Regex for extracting part of a file path

If we only wish to capture the slashes, ([\/]+) would be enough. To extract the segment one, other expressions are available, such as:

(?:\/[a-z]+\/)(.+?)(?:\/.+)

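and our code might look like this (the index is 1 because non-capturing groups are not numbered, so the pattern has only one capture group):

regexp_extract(filepath, '(?:\/[a-z]+\/)(.+?)(?:\/.+)', 1)

or

regexp_extract(filepath, '(?:\/.+?\/)(.+?)(?:\/.+)', 1)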

Compartments

In this case, a non-capturing group matches what comes before one without capturing it:

(?:\/[a-z]+\/)

then we capture one using:

(.+?)

and finally we add a right boundary after one in another non-capturing group:

(?:\/.+)

RegEx Circuit

jex.im visualizes regular expressions.

Depending on which slash one follows, we can modify the expression. For example, this expression might also work in this case:

(?:\/.+?\/)(.+?)(?:\/.+)
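
The two expressions can behave differently when the first segment is not purely lowercase letters. The sketch below uses Python's re module as a stand-in for regexp_extract, with a hypothetical path. In Python the wanted segment is group 1, since non-capturing groups are not numbered.

```python
import re

path = "/abc123/one/two/x.txt"  # hypothetical path, first segment has digits

# Left boundary must be a lowercase-only segment
strict = re.search(r'(?:/[a-z]+/)(.+?)(?:/.+)', path)

# Left boundary may be any segment
loose = re.search(r'(?:/.+?/)(.+?)(?:/.+)', path)

strict_segment = strict.group(1)
loose_segment = loose.group(1)

print(strict_segment)
print(loose_segment)
```

Here the strict form skips /abc123/ because of the digits and anchors on /one/ instead, so it captures two. The loose form accepts /abc123/ as the left boundary and captures one.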
