How to Find a Search Term in Source Code


IMHO there is a good answer to a similar question on "Unix & Linux":

grep works on pure text and does not know anything about the
underlying syntax of your C program. Therefore, in order not to search
inside comments, you have several options:

  1. Strip C comments before the search. You can do this using
    gcc -fpreprocessed -dD -E yourfile.c
    For details, please see "Remove comments from C/C++ code".

  2. Write/use some hacky half-working scripts like you have already found
    (e.g. they work by skipping lines starting with // or /*) in order to
    handle the details of all possible C/C++ comments (again, see the
    previous link for some scary testcases). Then you still may have false
    positives, but you do not have to preprocess anything.

  3. Use more advanced tools for doing "semantic search" in the code. I
    have found "coccigrep": http://home.regit.org/software/coccigrep/ This
    kind of tool allows searching for specific language statements
    (e.g. an update of a structure with a given name), and such tools
    certainly drop the comments.

https://unix.stackexchange.com/a/33136/158220

It doesn't completely cover your "not in strings" requirement, though.
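For what it's worth, here is a rough sketch of option 2 that also tries to cover the "not in strings" requirement. It is not taken from the linked answer: the function names mask_comments_and_strings and find_in_code are made up for illustration, and the scanner deliberately ignores harder cases such as backslash-continued // comments, C++ raw string literals, and digit separators like 1'000. It blanks out comments and string/char literals (keeping newlines so line numbers still match) and then searches what is left.

```python
def mask_comments_and_strings(source):
    """Blank out C/C++ comments and string/char literals, keeping newlines."""
    def blank(chunk):
        # Replace everything except newlines so line numbers stay aligned.
        return "".join("\n" if c == "\n" else " " for c in chunk)

    out = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        two = source[i:i + 2]
        if two == "//":                       # line comment: up to end of line
            j = source.find("\n", i)
            j = n if j == -1 else j
        elif two == "/*":                     # block comment: up to closing */
            j = source.find("*/", i + 2)
            j = n if j == -1 else j + 2
        elif ch in "\"'":                     # string or character literal
            j = i + 1
            while j < n and source[j] != ch:
                j += 2 if source[j] == "\\" else 1   # skip escaped characters
            j = min(j + 1, n)
        else:                                 # ordinary code: copy unchanged
            out.append(ch)
            i += 1
            continue
        out.append(blank(source[i:j]))
        i = j
    return "".join(out)


def find_in_code(source, term):
    """Return 1-based line numbers where term occurs outside comments/strings."""
    masked = mask_comments_and_strings(source)
    return [n for n, line in enumerate(masked.splitlines(), 1) if term in line]


sample = r'''int count = 0; // count starts at zero
/* count is
   updated below */
printf("count = %d\n", count);
puts("count");
'''
print(find_in_code(sample, "count"))  # [1, 4]
```

Lines 2, 3 and 5 only mention the term inside a comment or a string, so they are not reported. You will still get false positives and negatives on unusual code, which is why the preprocessing approach in option 1 is the safer one.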

How to find a particular string within a Source code(Xpath) and extract the proceeding text?

You can try to find the script node and get its text with XPath. Note that extract() returns a list, so take its first element:

node = html.select('//script[contains(., "[null,")]/text()').extract()[0]

and then extract the required substring:

node.split("[null,")[-1].split("]")[0]
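To make the split step easier to check in isolation, here is a small sketch that works on a plain string. The helper name text_between and the sample script text are assumptions for illustration, not part of the original page. Unlike the bare split chain, it returns None when the start marker is missing instead of silently returning unrelated text.

```python
def text_between(text, start="[null,", end="]"):
    """Return the text after the last `start` marker, up to the next `end`."""
    _, found, tail = text.rpartition(start)
    if not found:
        return None  # start marker is not present at all
    return tail.split(end, 1)[0]


# Hypothetical script content, standing in for the extracted node text.
script_text = 'var data = [null,"abc",42];'
print(text_between(script_text))  # "abc",42
```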

How do you search through your own libraries of source code?

I usually use the search function in my IDE (nuSphere phpEd). It is reasonably fast, and allows me to filter by file types. Windows' search facility is useless, and somehow manages to get worse in every new version.

Anyway, I asked a question about programming-friendly search programs a while back. Maybe one of the answers helps.

How to search for a specific keyword in html page source code with BeautifulSoup?

If your only purpose is to see whether the keyword is present or not, then you don't need to construct a BeautifulSoup object.

from urllib import request

url_1 = "https://cheapchicsdesigns.com"
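keyword = 'cdn.secomapp.com'
page = request.urlopen(url_1)

# read() returns bytes in Python 3, so decode before testing for a str keyword
print(keyword in page.read().decode('utf-8', errors='replace'))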

But I would recommend using requests, as it is easier:

import requests

url_1 = "https://cheapchicsdesigns.com"
keyword = 'cdn.secomapp.com'

res = requests.get(url_1)

print(keyword in res.text)
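If you want the check to survive unreachable hosts and differences in letter case, something along these lines might help. This is only a sketch using the standard library: the function name page_contains and the 10-second timeout are my own choices, and it returns None when the page could not be fetched, so that "not found" and "could not check" stay distinguishable.

```python
from urllib import request


def page_contains(url, keyword, timeout=10):
    """Return True/False if the page was fetched, or None if the fetch failed."""
    try:
        with request.urlopen(url, timeout=timeout) as page:
            # Use the charset announced by the server, falling back to UTF-8.
            charset = page.headers.get_content_charset() or "utf-8"
            html = page.read().decode(charset, errors="replace")
    except (OSError, ValueError):
        # URLError and timeouts are OSError subclasses; a malformed URL
        # raises ValueError.
        return None
    return keyword.lower() in html.lower()
```

Usage would be page_contains("https://cheapchicsdesigns.com", "cdn.secomapp.com"), with the result compared against None before treating it as a yes/no answer.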

Searching a functions source code

Here is a solution using deparse:

> grep("sort", deparse(shapiro.test))
[1] 5
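For comparison, the same idea can be sketched in Python, where inspect.getsource plays roughly the role of deparse. The helper name grep_function is made up here, and shlex.quote is just a convenient standard-library function to search. Line numbers are relative to the start of the function, much like the index that grep returns above.

```python
import inspect
import shlex


def grep_function(func, term):
    """Return (line_number, line) pairs of func's source that contain term."""
    lines = inspect.getsource(func).splitlines()
    return [(n, line) for n, line in enumerate(lines, 1) if term in line]


for n, line in grep_function(shlex.quote, "return"):
    print(n, line.strip())
```

Note that inspect.getsource raises TypeError for built-in functions implemented in C and OSError when the source file cannot be located, so this only works for functions whose Python source is available.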

How to search the web for pages containing certain source code?

To sum up the other answers, it seems there is currently no way to search for text in HTML source code.

There is one exception to this rule: if the code you search for is open-source and indexed by Google Code Search.

Code to find strings in source code over many urls

Reorganized your code a bit. The main culprit was whitespace. You need to trim your URL string before using it (i.e. trim($url);).

Other changes:

  • Set your search term outside the for loop, since it never changes.
  • Set up the curl object outside the loop and reuse it by just changing the URL each time.
  • Use curl_setopt_array() to set multiple curl options in one statement.
  • Use a foreach loop, since you're iterating over the entire array anyway and the code is cleaner.
  • Use stripos() instead of strstr(): it only returns the match position rather than a substring, so it is cheaper, and it is case-insensitive anyway.
  • Use the !== comparator to prevent implied typecasting (FALSE !== 0, but FALSE == 0).
  • Check the returned $html string as curl_exec() can return FALSE if it fails.
  • Close the curl object at the end (i.e. outside the if statement too).

The code below can be run on my quick mockup.

<html>
<body>

<form action="search.php" method="post">
URLs: <br/>
<textarea rows="20" cols="50" name="url"></textarea><br/>

Search Term: <br/>
<textarea rows="20" cols="50" name="proxy"></textarea><br/>

<input type="submit" />
</form>

<?php
if (isset($_POST['url'])) {
    set_time_limit(0);

    $urls = explode("\n", $_POST['url']);
    $term = $_POST['proxy']; // search term, set once outside the loop
    $options = array(
        CURLOPT_FOLLOWLOCATION => 1,
        CURLOPT_RETURNTRANSFER => 1,
        CURLOPT_CUSTOMREQUEST  => 'GET',
        CURLOPT_HEADER         => 1,
    );
    $ch = curl_init(); // one handle, reused for every URL
    curl_setopt_array($ch, $options);

    foreach ($urls as $url) {
        $url = trim($url); // strip stray whitespace such as "\r"
        curl_setopt($ch, CURLOPT_URL, $url);
        $html = curl_exec($ch);

        // stripos() returns 0 for a match at the very start, hence the strict !== FALSE
        if ($html !== FALSE && stripos($html, $term) !== FALSE) { // Found!
            echo htmlspecialchars($url), "<br/>\n";
        }
    }

    curl_close($ch);
}
?>

</body>
</html>
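In case it helps to see the same steps outside PHP, here is a rough Python sketch of the loop. Nothing in it comes from the original question: the names fetch and urls_containing are illustrative, and the fetcher parameter exists only so the function can be tried without network access. It trims each URL, skips blank lines, lowercases the search term once, and treats a failed request as "no match" rather than crashing.

```python
from urllib import request


def fetch(url, timeout=10):
    """Return the page body as text, or None if the request fails."""
    try:
        with request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except (OSError, ValueError):
        return None


def urls_containing(raw_urls, term, fetcher=fetch):
    """Yield each URL from a newline-separated string whose page contains term."""
    needle = term.lower()                 # prepared once, outside the loop
    for line in raw_urls.splitlines():
        url = line.strip()                # stray whitespace was the main culprit
        if not url:
            continue                      # ignore blank lines
        html = fetcher(url)
        if html is not None and needle in html.lower():
            yield url


# Offline demonstration with a fake fetcher backed by a dict.
pages = {"http://a.example": "<p>Hello Proxy</p>", "http://b.example": "nothing"}
raw = " http://a.example \r\nhttp://b.example\n\nhttp://c.example"
print(list(urls_containing(raw, "proxy", fetcher=pages.get)))  # ['http://a.example']
```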

