How to find a search term in source code
IMHO there is a good answer to a similar question at "Unix & Linux":
grep works on pure text and does not know anything about the
underlying syntax of your C program. Therefore, in order not to search
inside comments, you have several options:
- Strip C comments before the search. You can do this using gcc -fpreprocessed -dD -E yourfile.c. For details, please see "Remove comments from C/C++ code".
- Write/use some hacky half-working scripts like you have already found (e.g. they work by skipping lines starting with // or /*) in order to handle the details of all possible C/C++ comments (again, see the previous link for some scary test cases). You may still get false positives, but you do not have to preprocess anything.
- Use more advanced tools for doing "semantic search" in the code. I have found "coccigrep": http://home.regit.org/software/coccigrep/ This kind of tool allows searching for specific language statements (e.g. an update of a structure with a given name), and it certainly drops the comments.
https://unix.stackexchange.com/a/33136/158220
Although it doesn't completely cover your "not in strings" requirement.
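The "strip comments before the search" option can also be sketched in Python for cases where running gcc is not convenient. This is a rough regex-based sketch, not a real C lexer: the function names are hypothetical, and it assumes there are no trigraphs, no backslash-continued // comments, and no raw string literals. It blanks out comments and, to approximate the "not in strings" requirement, string and character literals too, while keeping newlines so that line numbers stay valid.

```python
import re

# One pattern for everything that should be ignored: // comments, /* */ comments,
# double-quoted string literals, and single-quoted character literals.
_IGNORED = re.compile(
    r'//[^\n]*|/\*.*?\*/|"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'',
    re.DOTALL,
)


def blank_comments_and_strings(source):
    """Replace comments and literals with spaces, preserving line breaks."""
    def blank(match):
        return re.sub(r"[^\n]", " ", match.group(0))
    return _IGNORED.sub(blank, source)


def find_in_code(source, term):
    """Return the 1-based line numbers where `term` occurs in actual code."""
    cleaned = blank_comments_and_strings(source)
    return [
        number
        for number, line in enumerate(cleaned.splitlines(), start=1)
        if term in line
    ]


if __name__ == "__main__":
    sample = (
        "int x = 1; // count here\n"
        "/* count\n"
        "   count */ int y;\n"
        'char *s = "count";\n'
        "count();\n"
    )
    print(find_in_code(sample, "count"))
```

Because the pattern is tried from left to right, a // inside a string literal is consumed as part of the string, and a quote inside a comment is consumed as part of the comment.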
How to find a particular string within a Source code(Xpath) and extract the proceeding text?
You can try to find the script node and get its text with XPath:
node = html.select('//script[contains(., "[null,")]/text()').extract()
and then extract the required substring (extract() returns a list of strings, so take the first match):
node[0].split("[null,")[-1].split("]")[0]
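The split chain raises no error when a marker is missing; it just returns the wrong text. A slightly more defensive version of the same substring step might look like the sketch below. The helper name text_between is hypothetical, and it only covers the string handling, not the XPath query.

```python
def text_between(text, start, end):
    """Return the text after the last `start` marker up to the next `end`.

    Returns None if either marker is missing, instead of silently
    returning unrelated text.
    """
    _, found_start, tail = text.rpartition(start)
    if not found_start:
        return None
    value, found_end, _ = tail.partition(end)
    return value if found_end else None


if __name__ == "__main__":
    script_text = 'var data = [null,"abc"];'
    print(text_between(script_text, "[null,", "]"))
```

For input that contains both markers, this gives the same result as split("[null,")[-1].split("]")[0].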
How do you search through your own libraries of source code?
I usually use the search function in my IDE (nuSphere phpEd). It is reasonably fast, and allows me to filter by file types. Windows' search facility is useless, and somehow manages to get worse in every new version.
Anyway, I asked a question about programming-friendly search programs a while back. Maybe one of the answers helps.
How to search for a specific keyword in html page source code with BeautifulSoup?
If your only purpose is to see whether the keyword is present or not, then you don't need to construct a BeautifulSoup object.
from urllib import request
url_1 = "https://cheapchicsdesigns.com"
keyword ='cdn.secomapp.com'
page = request.urlopen(url_1)
# read() returns bytes in Python 3, so decode before comparing with a str
print(keyword in page.read().decode(errors="replace"))
But I would recommend using requests, as it is easier:
import requests
url_1 = "https://cheapchicsdesigns.com"
keyword ='cdn.secomapp.com'
res = requests.get(url_1)
print(keyword in res.text)
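If you want to stay with the standard library, the check can be wrapped so that the bytes/str handling and the page's declared charset are dealt with in one place. This is only a sketch: the function names, the 10 second timeout, and the case-insensitive comparison are assumptions, not part of the original answer.

```python
from urllib import request


def body_contains(body, keyword, encoding="utf-8"):
    """Check a raw response body (bytes) for a keyword, ignoring case."""
    text = body.decode(encoding, errors="replace")
    return keyword.lower() in text.lower()


def url_contains(url, keyword, timeout=10):
    """Fetch `url` and report whether its source contains `keyword`."""
    with request.urlopen(url, timeout=timeout) as page:
        # Use the charset from the Content-Type header when the server sends one.
        encoding = page.headers.get_content_charset() or "utf-8"
        return body_contains(page.read(), keyword, encoding)
```

With the values from above, the call would be url_contains(url_1, keyword).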
Searching a functions source code
Here is a solution using deparse:
> grep ("sort", deparse(shapiro.test))
[1] 5
How to search the web for pages containing certain source code?
To sum up the other answers, it seems there is currently no way to search for text in HTML source code.
There is one exception to this rule: if the code you search for is open source and indexed by Google Code Search.
Code to find strings in source code over many urls
Reorganized your code a bit. The main culprit was whitespace. You need to trim your URL string before using it (i.e. trim($url)).
Other changes:
- Set your search term outside the for loop, since it never changes.
- Setup the curl object outside the loop and reuse it by just changing the URL each time.
- Use curl_setopt_array() to set multiple curl options in one statement.
- Use a foreach loop, since you're iterating over the entire array anyway and the code is cleaner.
- Using stripos() is more efficient than strstr() and is case-insensitive anyway.
- Use the !== comparator to prevent implied typecasting (FALSE !== 0, but FALSE == 0).
- Check the returned $html string as curl_exec() can return FALSE if it fails.
- Close the curl object at the end (i.e. outside the if statement too).
The code below can be run on my quick mockup.
<html>
<body>
<form action="search.php" method="post">
URLs: <br/>
<textarea rows="20" cols="50" name="url"></textarea><br/>
Search Term: <br/>
<textarea rows="20" cols="50" name="proxy"></textarea><br/>
<input type="submit" />
</form>
<?php
if(isset($_POST['url'])) {
set_time_limit (0);
$urls = explode("\n", $_POST['url']);
$term = $_POST['proxy'];
$options = array( CURLOPT_FOLLOWLOCATION => 1,
CURLOPT_RETURNTRANSFER => 1,
CURLOPT_CUSTOMREQUEST => 'GET',
CURLOPT_HEADER => 1,
);
$ch = curl_init();
curl_setopt_array($ch, $options);
foreach ($urls as $url) {
curl_setopt($ch, CURLOPT_URL, trim($url));
$html = curl_exec($ch);
if ($html !== FALSE && stripos($html, $term) !== FALSE) { // Found!
echo htmlspecialchars(trim($url)), "<br/>";
}
}
curl_close($ch);
}
?>
</body>
</html>
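For comparison, the same loop can be sketched in Python with only the standard library. The function names are hypothetical, and the fetcher parameter is an assumption added so the loop can be tried without network access. As in the PHP version, each URL is trimmed before use, the search is case-insensitive, and a failed request is skipped rather than searched.

```python
from urllib import error, request


def fetch(url, timeout=10):
    """Return the page source as text, or None if the request fails."""
    try:
        with request.urlopen(url, timeout=timeout) as response:
            return response.read().decode("utf-8", errors="replace")
    except (error.URLError, ValueError, OSError):
        return None


def urls_containing(raw_urls, term, fetcher=fetch):
    """Return the URLs (one per input line) whose source contains `term`."""
    needle = term.lower()  # set once, since it never changes
    hits = []
    for line in raw_urls.splitlines():
        url = line.strip()  # stray whitespace was the main culprit
        if not url:
            continue
        html = fetcher(url)
        if html is not None and needle in html.lower():
            hits.append(url)
    return hits


if __name__ == "__main__":
    # Canned pages instead of real requests, so this runs offline.
    pages = {"http://a.example": "xx NEEDLE xx", "http://b.example": "nothing"}
    found = urls_containing(" http://a.example \r\nhttp://b.example\n", "needle", pages.get)
    print(found)
```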