What is the most elegant way in Ruby to remove a parameter from a URL?
The addressable gem will do this nicely; please see the superior answer by The Tin Man. But if you want to roll your own, here's how. The only claim this code has to elegance is that it hides the ugly in a method:
#!/usr/bin/ruby1.8
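def reject_param(url, param_to_reject)
  # Regex from RFC 3986, Appendix B
  url_regex = %r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?$"
  raise "Not a url: #{url}" unless url =~ url_regex
  scheme_plus_punctuation = $1
  authority_with_punctuation = $3
  path = $5
  query = $7
  # $8 keeps the leading "#"; $9 would drop it
  fragment_with_punctuation = $8
  # Guard against a URL that has no query string at all
  kept = (query || '').split('&').reject do |param|
    param_name = param.split(/[=;]/).first
    param_name == param_to_reject
  end
  # Only emit "?" when at least one parameter survives
  query_with_punctuation = kept.empty? ? nil : "?#{kept.join('&')}"
  [scheme_plus_punctuation, authority_with_punctuation, path, query_with_punctuation, fragment_with_punctuation].join
end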
url = "http://example.com/path?param1=one&param2=2&param3=something3"
p url
p reject_param(url, 'param2')
# => "http://example.com/path?param1=one&param2=2&param3=something3"
# => "http://example.com/path?param1=one&param3=something3"
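If you would rather not maintain a regex and don't want an extra gem, the standard library's URI module can do the splitting for you. The following is only a sketch: remove_query_param is a made-up name, and it assumes the query string is ordinary form-encoded key=value pairs. Note that decoding and re-encoding may normalize how some characters are escaped.

```ruby
require 'uri'

# Hypothetical helper: drop every occurrence of one query parameter.
def remove_query_param(url, name)
  uri = URI.parse(url)
  pairs = URI.decode_www_form(uri.query || '')
  kept = pairs.reject { |key, _value| key == name }
  # A nil query omits the "?" entirely; an empty string would leave it behind.
  uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
  uri.to_s
end

puts remove_query_param('http://example.com/path?param1=one&param2=2&param3=something3', 'param2')
```

Because URI keeps the fragment separately, anything after # survives the round trip.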
Rails: Appending URL parameters & removing URL parameters
If you want to append the current parameters, you could try this:
users_path(params.merge(:b => 'goat'))
You may want to write a helper method that does this for you:
def merged_with_current_params(additional)
  params.merge(additional)
end
As to the second part of your question, you probably want to expand the incoming params into a series of checkboxes with the names and values set appropriately. Disabling a checkbox and submitting the form would have the effect of removing that param from the request.
To remove the :page parameter, use this in your helper instead:
params.except(:page).merge(additional)
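To see what the drop-then-merge step does without a Rails app, here is a rough sketch on a plain Hash. The hash contents and the name merged_without_page are invented for illustration, and real Rails params are not a plain Hash, so treat this as a model of the idea rather than of the Rails API.

```ruby
# Stand-in for the request params in a controller or view (assumed values).
current_params = { page: '3', q: 'ruby', sort: 'name' }

# Hypothetical helper: drop :page, then layer the extra params on top.
def merged_without_page(current, additional)
  current.reject { |key, _value| key == :page }.merge(additional)
end

result = merged_without_page(current_params, { b: 'goat' })
p result
```

Neither reject nor merge mutates the receiver, so the original hash is left untouched.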
Remove specific parts from url
Possibly the safest and best option is to use URI:
URI("https://www.youtube.com/watch/34345?v=rwmEkvPBG1s").path.split("/").last
# => "34345"
For more, see "How to extract URL parameters from a URL with Ruby or Rails?"
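As a small sketch of how the pieces come apart, URI exposes the path and the query string separately, and URI.decode_www_form turns the query into key/value pairs. The variable names here are just for illustration.

```ruby
require 'uri'

uri = URI.parse('https://www.youtube.com/watch/34345?v=rwmEkvPBG1s')

# Final segment of the path, ignoring the query string
last_segment = uri.path.split('/').last

# Query string as a Hash of parameter name => value
query_params = URI.decode_www_form(uri.query || '').to_h

p last_segment
p query_params
```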
Remove trailing ? from a string
The answers thus far all deal with the string itself. What you're actually doing is telling URI that the query is the empty string "". If you set it to nil when params.to_param == "", you won't have that problem.
def url_without_locale_params(url)
  uri = URI url
  params = Rack::Utils.parse_query uri.query
  params.delete 'locale'
  uri.query = params.to_param.blank? ? nil : params.to_param
  uri.to_s
end
Something like that should do the trick. The reason is that even with an empty string, URI assumes there is something to append, so it adds the leading ?.
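The difference between an empty query and a nil query can be checked with the standard library alone. This is a minimal sketch with a made-up URL:

```ruby
require 'uri'

uri = URI.parse('http://example.com/home?locale=en')

# An empty string still counts as "a query", so the "?" is kept.
uri.query = ''
with_empty_string = uri.to_s

# nil means "no query at all", so the "?" disappears.
uri.query = nil
with_nil = uri.to_s

p with_empty_string
p with_nil
```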
How to remove a part of string?
Try this:
str = "POST /test/userRegistration?id=1234&name=John&address=UK"
# [^&]* stops at the next "&", so only the name parameter is removed
str = str.sub(/&name=[^&]*/, '')
str
# => "POST /test/userRegistration?id=1234&address=UK"
What is the best practice to remove url parameters from web request log in PySpark?
Avoid UDFs if possible. A UDF is a black box to PySpark, so Spark cannot apply its optimizations to it efficiently. For details please read this.
Rather than using UDFs, you can use PySpark's SQL functions directly.
from pyspark.sql.functions import split
# from urllib.parse import urlsplit
split_with_question_mark = split(sdf.weblog, '\\?')
param_separated_df = sdf.withColumn("before_param", split_with_question_mark[0]).withColumn("after_param", split_with_question_mark[1])
param_separated_df.show(truncate=False)
Result:
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|weblog |before_param |after_param |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|[03/Oct/2021:09:26:37 +0000] SsAzIiWuV1Bw9CtthtxTtav8VdmP3N2jkJ/ZTsx6u8ATOC8HFwxKYmWwMrwl6t7heGKU7+Q== user_ZwfikI/2BdNcrhkwWai/bh+zX66co70YwGKAigzuLTW4khCvc1LLmFN1aBH7K0Loq8g== "HEAD /xxxx/pub/ping?xxxx-client=005 HTTP/1.1" 200 "-b" 53b 7ms "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" WepX20WkyvTydOpOuk/IDIVsxN+4zOZbRzng== 50000 - - |[03/Oct/2021:09:26:37 +0000] SsAzIiWuV1Bw9CtthtxTtav8VdmP3N2jkJ/ZTsx6u8ATOC8HFwxKYmWwMrwl6t7heGKU7+Q== user_ZwfikI/2BdNcrhkwWai/bh+zX66co70YwGKAigzuLTW4khCvc1LLmFN1aBH7K0Loq8g== "HEAD /xxxx/pub/ping|xxxx-client=005 HTTP/1.1" 200 "-b" 53b 7ms "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" WepX20WkyvTydOpOuk/IDIVsxN+4zOZbRzng== 50000 - - |
|[03/Oct/2021:00:19:24 +0000] W+APDZiRZIOjc/gmklDpL95WFxwkMRGthMXLnLDxbNZ6qZA== xxxxx.xxx.xxxx.corp "GET /xxxx/d5d/data/v10/notification_events/NotifcationEventCollection?$format=json&$filter=%20%20%2%20%20StartDate%20eq%20datetime"2021-03-24T00:15:05"%20and%20substringof("dude",SystemRoles)&$expand=MailLog&$skiptoken=3701%20 HTTP/1.1" 200 "-b" 7273b 391ms "python-requests/2.25.1" soso80-emea.xxxx.corp 50001 - -|[03/Oct/2021:00:19:24 +0000] W+APDZiRZIOjc/gmklDpL95WFxwkMRGthMXLnLDxbNZ6qZA== xxxxx.xxx.xxxx.corp "GET /xxxx/d5d/data/v10/notification_events/NotifcationEventCollection |$format=json&$filter=%20%20%2%20%20StartDate%20eq%20datetime"2021-03-24T00:15:05"%20and%20substringof("dude",SystemRoles)&$expand=MailLog&$skiptoken=3701%20 HTTP/1.1" 200 "-b" 7273b 391ms "python-requests/2.25.1" soso80-emea.xxxx.corp 50001 - -|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Once you have separated out the part before the query string, you can split the remainder on the protocol version, i.e. HTTP/1.1, to get the query parameters.
import pyspark.sql.functions as func
separated_by_comma = param_separated_df.withColumn("query_param", func.split(param_separated_df["after_param"], 'HTTP/1.1')[0])
separated_by_comma.show(truncate=False)
Result:
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
|weblog |before_param |after_param |query_param |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
|[03/Oct/2021:09:26:37 +0000] SsAzIiWuV1Bw9CtthtxTtav8VdmP3N2jkJ/ZTsx6u8ATOC8HFwxKYmWwMrwl6t7heGKU7+Q== user_ZwfikI/2BdNcrhkwWai/bh+zX66co70YwGKAigzuLTW4khCvc1LLmFN1aBH7K0Loq8g== "HEAD /xxxx/pub/ping?xxxx-client=005 HTTP/1.1" 200 "-b" 53b 7ms "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" WepX20WkyvTydOpOuk/IDIVsxN+4zOZbRzng== 50000 - - |[03/Oct/2021:09:26:37 +0000] SsAzIiWuV1Bw9CtthtxTtav8VdmP3N2jkJ/ZTsx6u8ATOC8HFwxKYmWwMrwl6t7heGKU7+Q== user_ZwfikI/2BdNcrhkwWai/bh+zX66co70YwGKAigzuLTW4khCvc1LLmFN1aBH7K0Loq8g== "HEAD /xxxx/pub/ping|xxxx-client=005 HTTP/1.1" 200 "-b" 53b 7ms "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" WepX20WkyvTydOpOuk/IDIVsxN+4zOZbRzng== 50000 - - |xxxx-client=005 |
|[03/Oct/2021:00:19:24 +0000] W+APDZiRZIOjc/gmklDpL95WFxwkMRGthMXLnLDxbNZ6qZA== xxxxx.xxx.xxxx.corp "GET /xxxx/d5d/data/v10/notification_events/NotifcationEventCollection?$format=json&$filter=%20%20%2%20%20StartDate%20eq%20datetime"2021-03-24T00:15:05"%20and%20substringof("dude",SystemRoles)&$expand=MailLog&$skiptoken=3701%20 HTTP/1.1" 200 "-b" 7273b 391ms "python-requests/2.25.1" soso80-emea.xxxx.corp 50001 - -|[03/Oct/2021:00:19:24 +0000] W+APDZiRZIOjc/gmklDpL95WFxwkMRGthMXLnLDxbNZ6qZA== xxxxx.xxx.xxxx.corp "GET /xxxx/d5d/data/v10/notification_events/NotifcationEventCollection |$format=json&$filter=%20%20%2%20%20StartDate%20eq%20datetime"2021-03-24T00:15:05"%20and%20substringof("dude",SystemRoles)&$expand=MailLog&$skiptoken=3701%20 HTTP/1.1" 200 "-b" 7273b 391ms "python-requests/2.25.1" soso80-emea.xxxx.corp 50001 - -|$format=json&$filter=%20%20%2%20%20StartDate%20eq%20datetime"2021-03-24T00:15:05"%20and%20substringof("dude",SystemRoles)&$expand=MailLog&$skiptoken=3701%20 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
All the above changes were made in the Colab notebook that you shared.
How can I remove Google tracking parameters (UTM) from an URL?
You can apply a regex to the urls to clean them up. Something like this should do the trick:
url = 'http://houseofbuttons.tumblr.com/post/22326009438?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+HouseOfButtons+%28House+of+Buttons%29&normal_param=1'
url.gsub(/&?utm_.+?(&|$)/, '')
# => "http://houseofbuttons.tumblr.com/post/22326009438?normal_param=1"
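One caveat with the regex: when a utm_ parameter sits between two ordinary parameters, both surrounding & characters are consumed, so the neighbours end up glued together. A possible alternative, sketched here with the standard library (strip_utm is a hypothetical name, and the example URL is made up), is to filter the decoded pairs by name:

```ruby
require 'uri'

# Hypothetical helper: drop every query parameter whose name starts with "utm_".
def strip_utm(url)
  uri = URI.parse(url)
  kept = URI.decode_www_form(uri.query || '').reject { |key, _value| key.start_with?('utm_') }
  # nil removes the "?" when nothing is left
  uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
  uri.to_s
end

puts strip_utm('http://example.com/post?a=1&utm_source=feed&b=2')
```

Filtering on the parameter name rather than on the raw string means the position of the tracking parameters no longer matters.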
Removing resources from url
You can set the :path option on the resources:
resources :pages, :path => '' do
I found this article very helpful for customizing my URLs: http://jasoncodes.com/posts/rails-3-nested-resource-slugs
There is also a great gem, friendly_id, for getting rid of the ids and customizing the slug: http://railscasts.com/episodes/314-pretty-urls-with-friendlyid