php - Header 404 vs Header 400: URL parsing error
I'm writing my own little PHP framework. I want to write it as semantically as it can be, and I'm stuck.
I've got a URL parsing class. It parses the whole URL (scheme, subdomain, domain, resource, and query). Next, a router class decides what to do with the URL. If there are resources corresponding to the URL, it "renders" them; if not, it renders a 404; if the resource is forbidden, it renders a 403, etc. The problem:
Let's say the site is under http://en.mysite.com. Let's say the pages asd and &*% do not exist. I've got 2 URLs:
http://en.mysite.com/asd
http://en.mysite.com/&*%($^&#
Of course, neither page exists. What should the headers be? I'm predicting this:
http://en.mysite.com/asd // header 404 page not found
http://en.mysite.com/&*% // header 400 bad request
However (based on our guru site):
http://stackoverflow.com/<< // header 404
http://stackoverflow.com/&;: // header 404
http://stackoverflow.com/&*%($%5e&# // header 400 (which btw is not styled...)
https://www.google.com/%&*(#$*%&@^ // header 404...
What is the rule? Should every system predict which symbols are OK in a URL? For me, a URL should contain [a-z0-9-_.#!]+. I'm using slashes for parameters, so I don't need ? = &. But what is the general rule? Is there a URL regex in the specification?
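As a rough illustration of the whitelist idea above, here is a minimal sketch that checks each slash-separated segment against that character class. The constant and function names are hypothetical, and the hyphen is escaped so the regex engine cannot read it as a range.

```php
<?php
// Assumption: the whitelist from the question, with "-" escaped.
// The D modifier stops "$" from matching before a trailing newline.
const SEGMENT_PATTERN = '/^[a-z0-9\-_.#!]+$/D';

// Hypothetical helper: true when every non-empty segment of $path
// consists only of whitelisted characters.
function pathUsesAllowedCharacters(string $path): bool
{
    // Drop the empty strings produced by leading or doubled slashes.
    $segments = array_filter(explode('/', $path), fn ($s) => $s !== '');

    foreach ($segments as $segment) {
        if (preg_match(SEGMENT_PATTERN, $segment) !== 1) {
            return false;
        }
    }

    return true;
}
```

With this sketch, /asd and /user/42/edit pass, while /&*% is rejected before the router ever looks for a resource.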
BTW: I could just put a 404 there and go drink a beer :).
But the problem is kind of serious in the case of SEO. A 400 is not quite the same as a 404 when it comes to positioning. And it would be nice to style the 400 page in my own way, and say not "page not found" but "Are you trying to inject something into my beautiful URL? It is a bad request!"
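To make the idea of a separately worded 400 page concrete, a sketch along these lines might do. The function name and the messages are made up for illustration and are not taken from any real framework.

```php
<?php
// Hypothetical helper: pick the body text for an error status.
function errorMessage(int $status): string
{
    // Illustrative wording only; adjust to taste.
    $messages = [
        400 => 'Bad request: the URL is malformed.',
        404 => 'Page not found.',
    ];

    // Fall back to a generic text for statuses without their own message.
    return $messages[$status] ?? 'Unexpected error.';
}

// Hypothetical usage inside a front controller:
// http_response_code(400);
// echo errorMessage(400);
```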
As far as I can tell from IETF RFC 2616, 400 should be returned for requests that are malformed (i.e. do not conform to IETF RFC 3986), whereas 404 should be returned for resources that do not exist (410 should be returned for resources that once existed but have gone).
In the above examples, the URLs with a %-sign not followed by 2 hexadecimal characters are malformed (e.g. en.mysite.com/&*%($^&#, www.google.com/%&*(#$*%&@^). Also malformed are queries that have 2 ? (question mark signs) in the last part.
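The %-sign rule could be checked before the router looks up the resource, roughly as in the sketch below. Both function names are hypothetical, and $knownPaths merely stands in for whatever lookup the router really performs.

```php
<?php
// Hypothetical helper: true when every "%" in $raw is followed
// by two hexadecimal characters.
function hasValidPercentEncoding(string $raw): bool
{
    // The negative lookahead finds a "%" that is NOT followed by two hex digits.
    return preg_match('/%(?![0-9A-Fa-f]{2})/', $raw) === 0;
}

// Hypothetical helper: choose a status code for a raw request path.
// Assumption: $knownPaths is a plain list of existing paths.
function statusForPath(string $rawPath, array $knownPaths): int
{
    if (!hasValidPercentEncoding($rawPath)) {
        return 400; // malformed request
    }

    // Well-formed, so the only question left is whether the resource exists.
    return in_array($rawPath, $knownPaths, true) ? 200 : 404;
}
```

Under these assumptions /asd yields 404 and /&*%($%5e& yields 400, which matches the prediction in the question.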
A regular expression for URLs can be found in the response to this question: PHP validation/regex for URL.