Text editor (Sublime Text, Geany, Notepad++, etc.) regex to remove all parameters from a URL string except one parameter-value pair

Posted by 点点圈 on 2021-02-08 10:46:28

Question


I am not very familiar with advanced regex matching patterns.

I have some Google Search URLs that I need to clean up without holding the Backspace key for 5 seconds to remove the unnecessary parameters.

Let's say I have this URL (there could be many different URLs following the same pattern):

https://www.google.com/search?source=hp&ei=Ne4pXpSIHIW_9QOD-rmADw&q=laravel+crud+generator&oq=laravel+crud+generator&gs_l=psy-ab.3..0l8.1294.6845..7289...1.0..0.307.3888.0j20j2j1......0....1..gws-wiz.....6..0i131j0i362i308i154i357.PwlZ_932pXo&ved=0ahUKEwjU9pz4tJrnAhWFX30KHQN9DvAQ4dUDCAU&uact=5

And I want to turn it into a nice, clean search URL like this:

https://www.google.com/search?q=laravel+crud+generator

How can I achieve that using regex Find/Replace in any of the text editors mentioned in the title?


Answer 1:


I'm posting this so that others can use the solution.

In Notepad++, press Ctrl+H, then select Regular expression under Search Mode near the bottom of the dialog.

Then enter this pattern in Find what: .+&(q=[^&]+).+ and in Replace with enter: https://www.google.com/search?$1

Now press the Replace button for a single replacement, or press Alt+A or the Replace All button to replace every occurrence.

You can check the pattern on Regex101.

Explanation:

1- .+& matches every character up to and including the & that comes right before q=. Here this part covers https://www.google.com/search?source=hp&ei=Ne4pXpSIHIW_9QOD-rmADw&

2- (q=[^&]+) is our target: we want q= and everything after it up to the next &. So we look for a string that starts with q= followed by characters that are not &. [^&] means one character that is not &, and + means one or more of them. This part matches q=laravel+crud+generator. Please note the parentheses.

3- .+ matches one or more of any character, so it covers the rest of the line: &oq=laravel+crud+generator&gs_l=psy-ab.3..0l8.1294.6845..7289...1.0..0.307.3888.0j20j2j1......0....1..gws-wiz.....6..0i131j0i362i308i154i357.PwlZ_932pXo&ved=0ahUKEwjU9pz4tJrnAhWFX30KHQN9DvAQ4dUDCAU&uact=5
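To see how the three parts split a URL, here is a minimal Python sketch. It is not part of the Notepad++ workflow: the extra parentheses around parts 1 and 3 are added only so each piece can be printed, and the URL is a shortened version of the one in the question (the long gs_l and ved parameters are left out).

```python
import re

# Same pattern as the answer, with extra parentheses (added only for inspection)
# around parts 1 and 3 so that each piece can be looked at separately.
pattern = re.compile(r"(.+&)(q=[^&]+)(.+)")

# Shortened version of the URL from the question (assumption: gs_l and ved removed).
url = (
    "https://www.google.com/search?source=hp&ei=Ne4pXpSIHIW_9QOD-rmADw"
    "&q=laravel+crud+generator&oq=laravel+crud+generator&uact=5"
)

match = pattern.fullmatch(url)
before, target, after = match.groups()

print(before)  # part 1: everything up to and including the & before q=
print(target)  # part 2: the q parameter and its value
print(after)   # part 3: the rest of the line
```

Note that the & in part 1 is what keeps the pattern from stopping at oq=, because oq= is preceded by "&o" rather than by "&" alone.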

OK, remember the () in step 2? That was a group. You can refer to a group in the replacement as $groupNumber, where groupNumber is the index of the pair of parentheses. Here we have just one pair of parentheses, so just one group, and we refer to it as $1.

And finally the replacement: https://www.google.com/search?$1. Here $1 is replaced by whatever group 1 captured, so the whole matched line becomes the base URL followed by the q parameter.
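The same find/replace can be tried outside the editor. The sketch below uses Python's re module, which writes the group reference as \1 where Notepad++ uses $1; the variable names and the shortened sample URL are assumptions made for illustration.

```python
import re

# Pattern and replacement from the answer; Python spells $1 as \1.
PATTERN = re.compile(r".+&(q=[^&]+).+")
REPLACEMENT = r"https://www.google.com/search?\1"

# Shortened version of the URL from the question (assumption: gs_l and ved removed).
url = (
    "https://www.google.com/search?source=hp&ei=Ne4pXpSIHIW_9QOD-rmADw"
    "&q=laravel+crud+generator&oq=laravel+crud+generator&uact=5"
)
cleaned = PATTERN.sub(REPLACEMENT, url)
print(cleaned)  # https://www.google.com/search?q=laravel+crud+generator

# Hypothetical URL where q is the first parameter: there is no "&" in front of
# "q=", so the pattern does not match and the line is left unchanged.
q_first = "https://www.google.com/search?q=laravel&source=hp"
unchanged = PATTERN.sub(REPLACEMENT, q_first)
print(unchanged == q_first)  # True
```

The second case suggests the pattern is best suited to URLs shaped like the sample, where q sits between other parameters.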




Answer 2:


Try replacing this pattern: (https://www.google.com/search\?).*(q=[^&]+).* with $1$2

Explanation:

  • (https://www.google.com/search\?) = matches the beginning of your specified string. Notice the escaped ? since it's a special character. Wrapped in parentheses, this becomes capture group #1 (accessible by $1)
  • .* = matches any characters, zero or more times, so it may also match nothing. It clears out anything between the start of the string and your q parameter
  • (q=[^&]+) = matches your q parameter up until the & symbol (indicating the next parameter). Wrapped in parentheses, this becomes capture group #2 (accessible by $2)
  • .* = again matches any characters, zero or more times. This part clears out anything after your q parameter's value

Replacement:

  • $1$2 = Simply replaces the whole match with capture group 1 followed by capture group 2

** Tested in Notepad++ with the sample string from the question
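One detail worth checking with this pattern: the first .* is greedy, so the engine backtracks from the end of the line and group 2 is taken from the last q= it finds. In the sample URL that is the tail of oq=, which happens to carry the same value as q, so the result looks right. The Python sketch below uses a made-up URL in which oq differs from q to make this visible. STRICT_PATTERN is a hypothetical variant, not from the answer, that only accepts q= directly after ? or &.

```python
import re

# Pattern from the answer, unchanged; Python uses \1\2 where Notepad++ uses $1$2.
ANSWER_PATTERN = re.compile(r"(https://www.google.com/search\?).*(q=[^&]+).*")

# Hypothetical stricter variant (an assumption, not from the answer):
# "q=" must come right after the "?" or right after an "&".
STRICT_PATTERN = re.compile(r"(https://www\.google\.com/search\?)(?:.*&)?(q=[^&]+).*")

# Made-up URL in which oq differs from q.
url = "https://www.google.com/search?source=hp&q=laravel+crud+generator&oq=laravel+crud&uact=5"

from_answer = ANSWER_PATTERN.sub(r"\1\2", url)
from_strict = STRICT_PATTERN.sub(r"\1\2", url)

print(from_answer)  # https://www.google.com/search?q=laravel+crud
print(from_strict)  # https://www.google.com/search?q=laravel+crud+generator
```

In an editor the same group references would be written $1$2, as in the answer.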



Source: https://stackoverflow.com/questions/59885566/text-editorsublime-text-geany-notepad-etc-regex-to-remove-all-parameters
