What to do with a community URL style like Last.FM or Wikipedia?

Posted by 拈花ヽ惹草 on 2019-12-22 09:49:24

Question


I'm trying to understand how to handle special characters in URLs, because I'm building a site where users can store content and reach a content page by typing its name in the URL.

So, something like Wikipedia or Last.FM.

I see that on those sites a user can type something like
http://it.wikipedia.org/wiki/Trentemøller and reach the artist's page.

After the page has loaded, the address bar shows the URL as:
http://it.wikipedia.org/wiki/Trentemøller but if I copy it and paste it into a text editor, it is pasted as
http://it.wikipedia.org/wiki/Trentem%C3%B8ller

So the character ø is pasted as %C3%B8.

The same goes for URLs like these (the page of the artist Takeshi Kobayashi):

http://www.last.fm/music/小林武史
http://www.last.fm/music/%E5%B0%8F%E6%9E%97%E6%AD%A6%E5%8F%B2

Whether I type the first or the second, the page works in either case. Why?

I think I should do something with .htaccess and mod_rewrite, but I'm not sure. Are special characters automatically converted to their URL-encoded form?

And then, how can I get PHP to run the right query with the content name?

If I have a table like

table_users
- username
- age
- height
- weight
- sex
- email
- country

With mod_rewrite I can map an address like http://mysite.com/user/bob to the username bob in table_users, but what about http://mysite.com/user/小林武史?

Here is a simple example of what I have in mind:

#.htaccess
RewriteEngine On
RewriteRule ^(user/)([a-zA-Z0-9_+-]+)([/]?)$ user.php?username=$2

<?php
// this is the page user.php
// this is the way I use to get the url value
print $_REQUEST["username"];
?>

This works, but it's limited to [a-zA-Z0-9_+-]. How can I accept all characters, as those sites do, without losing too much security?

Does anyone know a way to avoid trouble?


Answer 1:


Most browsers percent-encode 小林武史 (as PHP's urlencode() would) to %E5%B0%8F%E6%9E%97%E6%AD%A6%E5%8F%B2, sending each UTF-8 byte of the name as a %XX sequence.

Regarding your .htaccess mod_rewrite rules, have you considered using something like:
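
To make the mapping concrete, here is a minimal sketch that reproduces the two encoded forms from the question in PHP. It assumes the script file itself is saved as UTF-8, so the string literals hold UTF-8 bytes; the variable names are only illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8, so literals contain UTF-8 bytes.
$names = ['Trentemøller', '小林武史'];

$encodedNames = [];
foreach ($names as $name) {
    // rawurlencode() percent-encodes every byte outside the unreserved ASCII set.
    $encodedNames[$name] = rawurlencode($name);
    echo $name, ' => ', $encodedNames[$name], PHP_EOL;
}

// The single character ø is two bytes in UTF-8, hence the two %XX groups.
$oSlashBytes = bin2hex('ø');
echo 'ø as hex bytes: ', $oSlashBytes, PHP_EOL;
```

The browser displays the readable form in the address bar, but the percent-encoded form is what travels in the request, which is why both spellings reach the same page.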

RewriteEngine On
RewriteRule ^(user/)(.+?)[/]?$ user.php?username=$2
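
Because a pattern like (.+?) lets any character through, the checking has to move into PHP. The following is a rough sketch of what user.php might do, assuming the username arrives in $_GET as already-decoded UTF-8 bytes. The helper names clean_username and find_user, the 64-character limit, and the PDO connection are all hypothetical and not part of the original question.

```php
<?php
// Hypothetical helper: returns the username if acceptable, otherwise null.
function clean_username(string $raw): ?string
{
    // The /u modifier makes preg_match() fail on invalid UTF-8, and the
    // character class allows letters, combining marks and digits from any
    // script, plus the _ + - characters from the original rule.
    // The 1..64 length limit is an arbitrary assumption.
    if (preg_match('/\A[\p{L}\p{M}\p{N}_+\-]{1,64}\z/u', $raw) !== 1) {
        return null;
    }
    return $raw;
}

// Hypothetical lookup: a prepared statement keeps the name out of the SQL text.
function find_user(PDO $pdo, string $username): ?array
{
    $stmt = $pdo->prepare(
        'SELECT username, age, country FROM table_users WHERE username = :username'
    );
    $stmt->execute([':username' => $username]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    return $row === false ? null : $row;
}

// Usage sketch: validate first, then escape on output.
$username = clean_username($_GET['username'] ?? '');
if ($username !== null) {
    echo htmlspecialchars($username, ENT_QUOTES, 'UTF-8'), PHP_EOL;
}
```

The idea is that the rewrite rule only routes the request, while validation, the parameterized query, and output escaping each handle one specific risk. The table and its username column also need a UTF-8 capable character set for the comparison to match.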



Answer 2:


Try urlencode and urldecode.

Edit:

Here is a visual description of URL encoding and decoding:
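
As a small illustration of those two functions and their raw variants, here is a sketch of a round trip. The sample name is made up, and the file is assumed to be saved as UTF-8.

```php
<?php
// Assumption: a made-up content name containing spaces and non-ASCII text.
$name = 'Takeshi Kobayashi 小林武史';

// rawurlencode() suits path segments: a space becomes %20.
$forPath = rawurlencode($name);
// urlencode() suits query-string values: a space becomes +.
$forQuery = urlencode($name);

echo '/user/', $forPath, PHP_EOL;
echo 'user.php?username=', $forQuery, PHP_EOL;

// Each decoder reverses its matching encoder.
$fromPath = rawurldecode($forPath);
$fromQuery = urldecode($forQuery);
```

Note that PHP already decodes the values in $_GET and $_REQUEST, so calling urldecode() on them again is unnecessary and can corrupt names that legitimately contain % or +.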

http://blog.neraliu.com/wp-content/uploads/2009/10/url-encoding.png




Answer 3:


As far as I understand, every URL containing non-ASCII characters is mapped to a unique ASCII-only URL. This is actually done on the client side. See http://kmeleon.sourceforge.net/bugs/viewbug.php?bugid=631 for examples and links to the RFCs covering this.



Source: https://stackoverflow.com/questions/2128756/what-to-do-with-a-community-url-style-like-last-fm-or-wikipedia
