What Is the Most Elegant Way in Ruby to Remove a Parameter from a URL

What is the most elegant way in Ruby to remove a parameter from a URL?

The addressable gem will do this nicely; please see the superior answer by The Tin Man. But if you want to roll your own, here's how. The only claim this code has to elegance is that it hides the ugly in a method:

#!/usr/bin/ruby1.8

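def reject_param(url, param_to_reject)
  # Regex from RFC 3986, Appendix B
  url_regex = %r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?$"
  raise "Not a url: #{url}" unless url =~ url_regex
  scheme_plus_punctuation = $1
  authority_with_punctuation = $3
  path = $5
  query = $7
  fragment_with_punctuation = $8
  # Keep every parameter whose name differs from the one to reject;
  # to_s guards against URLs that have no query string at all
  query = query.to_s.split('&').reject do |param|
    param_name = param.split(/[=;]/).first
    param_name == param_to_reject
  end.join('&')
  # Re-add the '?' only when some parameters remain
  query = "?#{query}" unless query.empty?
  [scheme_plus_punctuation, authority_with_punctuation, path, query, fragment_with_punctuation].join
end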

url = "http://example.com/path?param1=one&param2=2&param3=something3"
p url
p reject_param(url, 'param2')

# => "http://example.com/path?param1=one&param2=2&param3=something3"
# => "http://example.com/path?param1=one&param3=something3"
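
To see what the numbered groups of that regex capture, the short sketch below matches it against a URL that has both a query and a fragment. The URL and the variable name parts are chosen only for illustration.

```ruby
# Same pattern as in reject_param above (RFC 3986, Appendix B)
url_regex = %r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?$"

# Illustrative URL, not taken from the answer
parts = url_regex.match("http://example.com/path?param1=one#top")

puts parts[1]  # scheme plus ':'
puts parts[3]  # authority plus leading '//'
puts parts[5]  # path
puts parts[7]  # query without '?'
puts parts[9]  # fragment without '#'
```

Groups 6 and 8 hold the query and fragment together with their leading ? and #, which is handy when the pieces are joined back together.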

Rails: Appending URL parameters & removing URL parameters

If you want to append the current parameters, you could try this:

users_path(params.merge(:b => 'goat'))

You may want to write a helper method that does this for you:

def merged_with_current_params(additional)
  params.merge(additional)
end

As to the second part of your question, you probably want to expand the incoming params into a series of checkboxes with the names and values set appropriately. Disabling the checkbox and submitting the form would have the effect of removing that param from the request.

To remove the :page parameter, use this in your helper instead:

params.except(:page).merge(additional)
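
Outside a Rails controller, the same idea can be tried on a plain Hash. The sketch below is only an illustration: current_params stands in for params, the helper name merged_without is hypothetical, and reject is used in place of except so that it runs without ActiveSupport.

```ruby
# Hypothetical stand-in for the Rails params object
current_params = { :page => '3', :sort => 'name' }

# Plain-Ruby equivalent of params.except(:page).merge(additional)
def merged_without(params, removed_key, additional)
  params.reject { |key, _value| key == removed_key }.merge(additional)
end

result = merged_without(current_params, :page, { :b => 'goat' })
p result
```

Both reject and merge return new hashes, so the original parameters are left untouched.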

Remove specific parts from url

Possibly the safest and best approach is to use URI.

URI("https://www.youtube.com/watch/34345?v=rwmEkvPBG1s").path.split("/").last # => "34345"

For more, see "How to extract URL parameters from a URL with Ruby or Rails?"
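
To make the individual parts concrete, the following sketch splits the same URL with the standard uri library and also reads the v parameter from the query. The variable names are illustrative only.

```ruby
require 'uri'

uri = URI.parse("https://www.youtube.com/watch/34345?v=rwmEkvPBG1s")

last_segment = uri.path.split("/").last       # final path segment
query_pairs  = URI.decode_www_form(uri.query) # array of [key, value] pairs
video_id     = query_pairs.assoc("v")&.last   # value of "v", or nil if absent

puts uri.host
puts last_segment
puts video_id
```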

Remove trailing ? from a string

The answers so far all operate on the string itself. What you're actually doing is telling URI that the query is "" (an empty string). If you set the query to nil whenever params.to_param == "", you won't have that problem.

def url_without_locale_params(url)
  uri = URI url
  params = Rack::Utils.parse_query uri.query
  params.delete 'locale'
  uri.query = params.to_param.blank? ? nil : params.to_param
  uri.to_s
end

Something like that should do the trick. The reason is that even with an empty string, URI assumes there is a query to append, so it still emits the leading ?.
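
The difference between an empty query and a nil query can be checked with the standard uri library alone. This is a small sketch; the example.com URL and the variable names are chosen purely for illustration.

```ruby
require 'uri'

uri = URI.parse("http://example.com/path?locale=en")

uri.query = ""              # empty string: URI still writes the '?'
with_empty_query = uri.to_s

uri.query = nil             # nil: no query component at all
with_nil_query = uri.to_s

puts with_empty_query
puts with_nil_query
```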

How to remove a part of string?

Try this ...

str = "POST /test/userRegistration?id=1234&name=John&address=UK"
str = str.sub(/&name=[^&]*/, '')
str
=> "POST /test/userRegistration?id=1234&address=UK"
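
If the position of the parameter is not known in advance, splitting on the delimiters sidesteps most regex edge cases. The helper below is a hypothetical sketch; it assumes the query string runs to the end of the text, as in the example above.

```ruby
# Hypothetical helper: drops the parameter called name from the query part of text.
def drop_param(text, name)
  head, query = text.split('?', 2)
  return text if query.nil?

  # Keep every "key=value" pair whose key differs from name
  kept = query.split('&').reject { |pair| pair.split('=', 2).first == name }
  kept.empty? ? head : "#{head}?#{kept.join('&')}"
end

str = "POST /test/userRegistration?id=1234&name=John&address=UK"
puts drop_param(str, 'name')
```

The same call also works when name is the first, last, or only parameter, because the ? is added back only if something remains.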

What is the best practice to remove url parameters from web request log in PySpark?

Avoid using a UDF if possible. A UDF is a black box to PySpark, so Spark cannot apply its optimizations to it.

Rather than using UDFs, you can use PySpark's SQL functions directly.


from pyspark.sql.functions import split
# from urllib.parse import urlsplit
split_with_question_mark = split(sdf.weblog, '\\?')
param_separated_df = sdf.withColumn("before_param", split_with_question_mark[0]).withColumn("after_param", split_with_question_mark[1])
param_separated_df.show(truncate=False)

Result:


+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|weblog |before_param |after_param |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|[03/Oct/2021:09:26:37 +0000] SsAzIiWuV1Bw9CtthtxTtav8VdmP3N2jkJ/ZTsx6u8ATOC8HFwxKYmWwMrwl6t7heGKU7+Q== user_ZwfikI/2BdNcrhkwWai/bh+zX66co70YwGKAigzuLTW4khCvc1LLmFN1aBH7K0Loq8g== "HEAD /xxxx/pub/ping?xxxx-client=005 HTTP/1.1" 200 "-b" 53b 7ms "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" WepX20WkyvTydOpOuk/IDIVsxN+4zOZbRzng== 50000 - - |[03/Oct/2021:09:26:37 +0000] SsAzIiWuV1Bw9CtthtxTtav8VdmP3N2jkJ/ZTsx6u8ATOC8HFwxKYmWwMrwl6t7heGKU7+Q== user_ZwfikI/2BdNcrhkwWai/bh+zX66co70YwGKAigzuLTW4khCvc1LLmFN1aBH7K0Loq8g== "HEAD /xxxx/pub/ping|xxxx-client=005 HTTP/1.1" 200 "-b" 53b 7ms "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" WepX20WkyvTydOpOuk/IDIVsxN+4zOZbRzng== 50000 - - |
|[03/Oct/2021:00:19:24 +0000] W+APDZiRZIOjc/gmklDpL95WFxwkMRGthMXLnLDxbNZ6qZA== xxxxx.xxx.xxxx.corp "GET /xxxx/d5d/data/v10/notification_events/NotifcationEventCollection?$format=json&$filter=%20%20%2%20%20StartDate%20eq%20datetime"2021-03-24T00:15:05"%20and%20substringof("dude",SystemRoles)&$expand=MailLog&$skiptoken=3701%20 HTTP/1.1" 200 "-b" 7273b 391ms "python-requests/2.25.1" soso80-emea.xxxx.corp 50001 - -|[03/Oct/2021:00:19:24 +0000] W+APDZiRZIOjc/gmklDpL95WFxwkMRGthMXLnLDxbNZ6qZA== xxxxx.xxx.xxxx.corp "GET /xxxx/d5d/data/v10/notification_events/NotifcationEventCollection |$format=json&$filter=%20%20%2%20%20StartDate%20eq%20datetime"2021-03-24T00:15:05"%20and%20substringof("dude",SystemRoles)&$expand=MailLog&$skiptoken=3701%20 HTTP/1.1" 200 "-b" 7273b 391ms "python-requests/2.25.1" soso80-emea.xxxx.corp 50001 - -|
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

Once you have separated the part before the query string, you can split the remainder on the protocol version, i.e. HTTP/1.1, to get the query parameters.

import pyspark.sql.functions as func

separated_by_comma = param_separated_df.withColumn("query_param", func.split(param_separated_df["after_param"], 'HTTP/1.1')[0])
separated_by_comma.show(truncate=False)

Result:

+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
|weblog |before_param |after_param |query_param |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
|[03/Oct/2021:09:26:37 +0000] SsAzIiWuV1Bw9CtthtxTtav8VdmP3N2jkJ/ZTsx6u8ATOC8HFwxKYmWwMrwl6t7heGKU7+Q== user_ZwfikI/2BdNcrhkwWai/bh+zX66co70YwGKAigzuLTW4khCvc1LLmFN1aBH7K0Loq8g== "HEAD /xxxx/pub/ping?xxxx-client=005 HTTP/1.1" 200 "-b" 53b 7ms "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" WepX20WkyvTydOpOuk/IDIVsxN+4zOZbRzng== 50000 - - |[03/Oct/2021:09:26:37 +0000] SsAzIiWuV1Bw9CtthtxTtav8VdmP3N2jkJ/ZTsx6u8ATOC8HFwxKYmWwMrwl6t7heGKU7+Q== user_ZwfikI/2BdNcrhkwWai/bh+zX66co70YwGKAigzuLTW4khCvc1LLmFN1aBH7K0Loq8g== "HEAD /xxxx/pub/ping|xxxx-client=005 HTTP/1.1" 200 "-b" 53b 7ms "Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)" WepX20WkyvTydOpOuk/IDIVsxN+4zOZbRzng== 50000 - - |xxxx-client=005 |
|[03/Oct/2021:00:19:24 +0000] W+APDZiRZIOjc/gmklDpL95WFxwkMRGthMXLnLDxbNZ6qZA== xxxxx.xxx.xxxx.corp "GET /xxxx/d5d/data/v10/notification_events/NotifcationEventCollection?$format=json&$filter=%20%20%2%20%20StartDate%20eq%20datetime"2021-03-24T00:15:05"%20and%20substringof("dude",SystemRoles)&$expand=MailLog&$skiptoken=3701%20 HTTP/1.1" 200 "-b" 7273b 391ms "python-requests/2.25.1" soso80-emea.xxxx.corp 50001 - -|[03/Oct/2021:00:19:24 +0000] W+APDZiRZIOjc/gmklDpL95WFxwkMRGthMXLnLDxbNZ6qZA== xxxxx.xxx.xxxx.corp "GET /xxxx/d5d/data/v10/notification_events/NotifcationEventCollection |$format=json&$filter=%20%20%2%20%20StartDate%20eq%20datetime"2021-03-24T00:15:05"%20and%20substringof("dude",SystemRoles)&$expand=MailLog&$skiptoken=3701%20 HTTP/1.1" 200 "-b" 7273b 391ms "python-requests/2.25.1" soso80-emea.xxxx.corp 50001 - -|$format=json&$filter=%20%20%2%20%20StartDate%20eq%20datetime"2021-03-24T00:15:05"%20and%20substringof("dude",SystemRoles)&$expand=MailLog&$skiptoken=3701%20 |
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------+

All of the above changes have been made in the Colab notebook that you shared.

How can I remove Google tracking parameters (UTM) from a URL?

You can apply a regex to the URLs to clean them up. Something like this should do the trick:

url = 'http://houseofbuttons.tumblr.com/post/22326009438?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+HouseOfButtons+%28House+of+Buttons%29&normal_param=1'
url.gsub(/&?utm_.+?(&|$)/, '')
# => "http://houseofbuttons.tumblr.com/post/22326009438?normal_param=1"
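
The regex works for the URL above, but it depends on where the utm_ parameters sit: when one falls between two ordinary parameters, the pattern removes both surrounding & separators and fuses the neighbours together. A parser-based sketch avoids that. The helper name strip_utm_params is made up for illustration, and because the remaining values are re-encoded on output, unusual escaping may not round-trip byte for byte.

```ruby
require 'uri'

# Hypothetical helper: removes every query parameter whose name starts with "utm_".
def strip_utm_params(url)
  uri = URI.parse(url)
  return url if uri.query.nil?

  # decode_www_form yields [key, value] pairs in their original order
  kept = URI.decode_www_form(uri.query).reject { |key, _value| key.start_with?('utm_') }

  # A nil query makes URI omit the '?' entirely
  uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
  uri.to_s
end

puts strip_utm_params('http://example.com/post?a=1&utm_source=feed&b=2')
```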

Removing resources from url

You can set the path option on the resources:

 resources :pages, :path => '' do

I found this article very helpful in customizing my URLs: http://jasoncodes.com/posts/rails-3-nested-resource-slugs

There is also a great gem, friendly_id, for getting rid of the IDs and customizing the slug: http://railscasts.com/episodes/314-pretty-urls-with-friendlyid


