module URI
URI is a module providing classes to handle Uniform Resource Identifiers (RFC 2396).
Features
- Uniform handling of URIs
- Flexibility to introduce custom URI schemes
- Flexibility to have an alternate URI::Parser (or just different patterns and regexps)
Basic example
  require 'uri'

  uri = URI("http://foo.com/posts?id=30&limit=5#time=1305298413")
  #=> #<URI::HTTP:0x00000000b14880 URL:http://foo.com/posts?id=30&limit=5#time=1305298413>
  uri.scheme
  #=> "http"
  uri.host
  #=> "foo.com"
  uri.path
  #=> "/posts"
  uri.query
  #=> "id=30&limit=5"
  uri.fragment
  #=> "time=1305298413"

  uri.to_s
  #=> "http://foo.com/posts?id=30&limit=5#time=1305298413"
Adding custom URIs
  module URI
    class RSYNC < Generic
      DEFAULT_PORT = 873
    end
    @@schemes['RSYNC'] = RSYNC
  end
  #=> URI::RSYNC

  URI.scheme_list
  #=> {"FTP"=>URI::FTP, "HTTP"=>URI::HTTP, "HTTPS"=>URI::HTTPS,
  #    "LDAP"=>URI::LDAP, "LDAPS"=>URI::LDAPS, "MAILTO"=>URI::MailTo,
  #    "RSYNC"=>URI::RSYNC}

  uri = URI("rsync://rsync.foo.com")
  #=> #<URI::RSYNC:0x00000000f648c8 URL:rsync://rsync.foo.com>
RFC References
A good place to view an RFC spec is www.ietf.org/rfc.html
Here is a list of all related RFCs.
Class tree
- URI::Generic (in uri/generic.rb)
  - URI::FTP (in uri/ftp.rb)
  - URI::HTTP (in uri/http.rb)
    - URI::HTTPS (in uri/https.rb)
  - URI::LDAP (in uri/ldap.rb)
    - URI::LDAPS (in uri/ldaps.rb)
  - URI::MailTo (in uri/mailto.rb)
- URI::Parser (in uri/common.rb)
- URI::REGEXP (in uri/common.rb)
  - URI::REGEXP::PATTERN (in uri/common.rb)
- URI::Util (in uri/common.rb)
- URI::Escape (in uri/common.rb)
- URI::Error (in uri/common.rb)
  - URI::InvalidURIError (in uri/common.rb)
  - URI::InvalidComponentError (in uri/common.rb)
  - URI::BadURIError (in uri/common.rb)
Copyright Info
- Author: Akira Yamada <akira@ruby-lang.org>
- Documentation: Akira Yamada <akira@ruby-lang.org>, Dmitry V. Sabanin <sdmitry@lrn.ru>, Vincent Batts <vbatts@hashbangbash.com>
- License: Copyright © 2001 akira yamada <akira@ruby-lang.org>. You can redistribute it and/or modify it under the same term as Ruby.
- Revision: $Id$
Constants
- DEFAULT_PARSER
Public Class Methods
Decodes URL-encoded form data from the given str.
This decodes application/x-www-form-urlencoded data and returns an array of key-value arrays. It internally uses ::decode_www_form_component.
The _charset_ hack is not supported at present because the mapping from a given charset to Ruby's encoding is not yet clear. See also www.w3.org/TR/html5/syntax.html#character-encodings-0
This refers to www.w3.org/TR/html5/forms.html#url-encoded-form-data
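  ary = URI.decode_www_form("a=1&a=2&b=3")
  p ary                          #=> [['a', '1'], ['a', '2'], ['b', '3']]
  p ary.assoc('a').last          #=> '1'
  p ary.assoc('b').last          #=> '3'
  # rassoc matches on the value, not the key, so search the reversed
  # array by key to get the last value given for 'a'.
  p ary.reverse.assoc('a').last  #=> '2'
  p Hash[ary]                    #=> {"a"=>"2", "b"=>"3"}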
See ::decode_www_form_component, ::encode_www_form
  # File lib/uri/common.rb, line 974
  def self.decode_www_form(str, enc=Encoding::UTF_8)
    return [] if str.empty?
    unless /\A#{WFKV_}=#{WFKV_}(?:[;&]#{WFKV_}=#{WFKV_})*\z/o =~ str
      raise ArgumentError, "invalid data of application/x-www-form-urlencoded (#{str})"
    end
    ary = []
    $&.scan(/([^=;&]+)=([^;&]*)/) do
      ary << [decode_www_form_component($1, enc), decode_www_form_component($2, enc)]
    end
    ary
  end
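Because the result is an array of pairs, repeated keys are preserved, whereas Hash[ary] keeps only the last value for each key. The following is a minimal sketch of collecting every value per key; form_params is a hypothetical helper introduced here for illustration and is not part of URI.

```ruby
require 'uri'

# Hypothetical helper (not part of URI): groups the decoded pairs by key,
# keeping every value for keys that appear more than once.
def form_params(query)
  URI.decode_www_form(query).each_with_object({}) do |(key, value), params|
    (params[key] ||= []) << value
  end
end

form_params("a=1&a=2&b=3")  #=> {"a"=>["1", "2"], "b"=>["3"]}
form_params("")             #=> {}
```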
Decodes the given str of URL-encoded form data.
This decodes + to SP (ASCII space).
See ::encode_www_form_component, ::decode_www_form
  # File lib/uri/common.rb, line 897
  def self.decode_www_form_component(str, enc=Encoding::UTF_8)
    raise ArgumentError, "invalid %-encoding (#{str})" unless /\A[^%]*(?:%\h\h[^%]*)*\z/ =~ str
    str.dup.force_encoding("ASCII-8BIT")
       .gsub(/\+|%\h\h/, TBLDECWWWCOMP_)
       .force_encoding(enc)
  end
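As the listing shows, a % that is not followed by two hexadecimal digits raises ArgumentError. The sketch below shows typical results and one way to tolerate malformed input; decode_or_nil is a hypothetical helper, not part of URI.

```ruby
require 'uri'

URI.decode_www_form_component("ruby+on+rails")  #=> "ruby on rails"
URI.decode_www_form_component("a%26b%3Dc")      #=> "a&b=c"

# Hypothetical helper (not part of URI): returns nil instead of raising
# when the input contains an invalid %-sequence.
def decode_or_nil(str)
  URI.decode_www_form_component(str)
rescue ArgumentError
  nil
end

decode_or_nil("100%")  #=> nil
```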
Generates URL-encoded form data from the given enum.
This generates application/x-www-form-urlencoded data, as defined in HTML5, from a given Enumerable object.
This internally uses ::encode_www_form_component.
This method doesn't convert the encoding of the given items, so convert them before calling this method if you want to send data in an encoding other than the original one or as mixed-encoding data. (Strings which are encoded in an HTML5 ASCII-incompatible encoding are converted to UTF-8.)
This method doesn't handle files. When you send a file, use multipart/form-data.
This is an implementation of www.w3.org/TR/html5/forms.html#url-encoded-form-data
  URI.encode_www_form([["q", "ruby"], ["lang", "en"]])
  #=> "q=ruby&lang=en"
  URI.encode_www_form("q" => "ruby", "lang" => "en")
  #=> "q=ruby&lang=en"
  URI.encode_www_form("q" => ["ruby", "perl"], "lang" => "en")
  #=> "q=ruby&q=perl&lang=en"
  URI.encode_www_form([["q", "ruby"], ["q", "perl"], ["lang", "en"]])
  #=> "q=ruby&q=perl&lang=en"
See ::encode_www_form_component, ::decode_www_form
  # File lib/uri/common.rb, line 932
  def self.encode_www_form(enum)
    enum.map do |k,v|
      if v.nil?
        encode_www_form_component(k)
      elsif v.respond_to?(:to_ary)
        v.to_ary.map do |w|
          str = encode_www_form_component(k)
          unless w.nil?
            str << '='
            str << encode_www_form_component(w)
          end
        end.join('&')
      else
        str = encode_www_form_component(k)
        str << '='
        str << encode_www_form_component(v)
      end
    end.join('&')
  end
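A common use of this method is building a query string for a URL. The following is a small sketch under that assumption; url_with_query is a hypothetical helper, not part of URI, and the base URL is only an example.

```ruby
require 'uri'

# Hypothetical helper (not part of URI): appends form-encoded parameters
# to a base URL string that has no query part yet.
def url_with_query(base, params)
  "#{base}?#{URI.encode_www_form(params)}"
end

url_with_query("http://foo.com/posts", "id" => 30, "tag" => ["a b", "c&d"])
#=> "http://foo.com/posts?id=30&tag=a+b&tag=c%26d"

# The generated data can be read back with URI.decode_www_form.
URI.decode_www_form("id=30&tag=a+b&tag=c%26d")
#=> [["id", "30"], ["tag", "a b"], ["tag", "c&d"]]
```

Values are converted with to_s before encoding, which is why the Integer 30 is accepted here.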
Encodes the given str as URL-encoded form data.
This method doesn't convert *, -, ., 0-9, A-Z, _, a-z, but does convert SP (ASCII space) to + and converts others to %XX.
This is an implementation of www.w3.org/TR/html5/association-of-controls-and-forms.html#url-encoded-form-data
See ::decode_www_form_component, ::encode_www_form
  # File lib/uri/common.rb, line 880
  def self.encode_www_form_component(str)
    str = str.to_s
    if HTML5ASCIIINCOMPAT.include?(str.encoding)
      str = str.encode(Encoding::UTF_8)
    else
      str = str.dup
    end
    str.force_encoding(Encoding::ASCII_8BIT)
    str.gsub!(/[^*\-.0-9A-Z_a-z]/, TBLENCWWWCOMP_)
    str.force_encoding(Encoding::US_ASCII)
  end
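The character rules described above can be checked with a few short calls. This is only an illustrative sketch; the input strings are arbitrary examples.

```ruby
require 'uri'

# Characters in the unconverted set pass through; the space becomes "+".
URI.encode_www_form_component("a b*c-d.e_f")  #=> "a+b*c-d.e_f"

# Everything else, including "=", "&" and "~", becomes %XX.
URI.encode_www_form_component("x=1&y=2")      #=> "x%3D1%26y%3D2"
URI.encode_www_form_component("~")            #=> "%7E"
```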
Synopsis
  URI::extract(str[, schemes][,&blk])
Args
- str: String to extract URIs from.
- schemes: Limit URI matching to specific schemes.
Description
Extracts URIs from a string. If a block is given, iterates through all matched URIs. Returns nil if a block is given; otherwise returns an array of the matches.
Usage
  require "uri"

  URI.extract("text here http://foo.example.org/bla and here mailto:test@example.com and here also.")
  # => ["http://foo.example.org/bla", "mailto:test@example.com"]
  # File lib/uri/common.rb, line 812
  def self.extract(str, schemes = nil, &block)
    DEFAULT_PARSER.extract(str, schemes, &block)
  end
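The schemes argument and the block form are not shown in the usage above, so here is a brief sketch of both. uris_with_scheme is a hypothetical helper, not part of URI. Note that later Ruby versions mark URI.extract as obsolete and may print a warning in verbose mode.

```ruby
require 'uri'

text = "text here http://foo.example.org/bla and here mailto:test@example.com and here also."

# Hypothetical helper (not part of URI): returns only the URIs whose
# scheme is one of the given schemes.
def uris_with_scheme(text, *schemes)
  URI.extract(text, schemes)
end

uris_with_scheme(text, "http")  #=> ["http://foo.example.org/bla"]

# Block form: each match is yielded and the method itself returns nil.
found = []
result = URI.extract(text) { |uri| found << uri }
result  #=> nil
found   #=> ["http://foo.example.org/bla", "mailto:test@example.com"]
```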
Synopsis
  URI::join(str[, str, ...])
Args
- str: String(s) to work with.
Description
Joins URIs, resolving each string in turn as a reference relative to the result so far.
Usage
  require 'uri'

  p URI.join("http://example.com/","main.rbx")
  # => #<URI::HTTP:0x2022ac02 URL:http://example.com/main.rbx>

  p URI.join('http://example.com', 'foo')
  # => #<URI::HTTP:0x01ab80a0 URL:http://example.com/foo>

  p URI.join('http://example.com', '/foo', '/bar')
  # => #<URI::HTTP:0x01aaf0b0 URL:http://example.com/bar>

  p URI.join('http://example.com', '/foo', 'bar')
  # => #<URI::HTTP:0x801a92af0 URL:http://example.com/bar>

  p URI.join('http://example.com', '/foo/', 'bar')
  # => #<URI::HTTP:0x80135a3a0 URL:http://example.com/foo/bar>
  # File lib/uri/common.rb, line 784
  def self.join(*str)
    DEFAULT_PARSER.join(*str)
  end
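The last two usage examples differ only in a trailing slash: a relative reference replaces the final path segment of the base unless the base path ends with "/". If the intent is to append a segment below the base path, one possible approach is sketched here; append_path is a hypothetical helper, not part of URI.

```ruby
require 'uri'

# Plain URI.join: "bar" replaces the last path segment of the base.
URI.join("http://example.com/foo", "bar").to_s  #=> "http://example.com/bar"

# Hypothetical helper (not part of URI): appends a segment beneath the base
# path by making sure the base ends with "/" and the segment does not
# start with one.
def append_path(base, segment)
  base = base.to_s
  base += "/" unless base.end_with?("/")
  URI.join(base, segment.sub(%r{\A/+}, ""))
end

append_path("http://example.com/foo", "bar").to_s    #=> "http://example.com/foo/bar"
append_path("http://example.com/foo/", "/bar").to_s  #=> "http://example.com/foo/bar"
```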
Synopsis
  URI::parse(uri_str)
Args
- uri_str: String with URI.
Description
Creates an instance of one of URI's subclasses from the string.
Raises
- URI::InvalidURIError: Raised if the given URI is not a valid one.
Usage
  require 'uri'

  uri = URI.parse("http://www.ruby-lang.org/")
  p uri
  # => #<URI::HTTP:0x202281be URL:http://www.ruby-lang.org/>
  p uri.scheme
  # => "http"
  p uri.host
  # => "www.ruby-lang.org"
  # File lib/uri/common.rb, line 746
  def self.parse(uri)
    DEFAULT_PARSER.parse(uri)
  end
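When the input comes from an untrusted source, the error described above usually needs to be handled. A minimal sketch follows; parse_or_nil is a hypothetical helper, not part of URI.

```ruby
require 'uri'

# Hypothetical helper (not part of URI): returns nil for strings that
# URI.parse rejects instead of raising URI::InvalidURIError.
def parse_or_nil(str)
  URI.parse(str)
rescue URI::InvalidURIError
  nil
end

parse_or_nil("http://www.ruby-lang.org/").host  #=> "www.ruby-lang.org"
parse_or_nil("http://bad host/")                #=> nil (a space is not allowed)

# The class of the result depends on the scheme.
parse_or_nil("ftp://ftp.ruby-lang.org/pub/").class  #=> URI::FTP
parse_or_nil("mailto:test@example.com").class       #=> URI::MailTo
```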
Synopsis
  URI::regexp([match_schemes])
Args
- match_schemes: Array of schemes. If given, the resulting regexp matches URIs whose scheme is one of the match_schemes.
Description
Returns a Regexp object which matches URI-like strings. The Regexp object returned by this method includes an arbitrary number of capture groups (parentheses). Never rely on their number.
Usage
  require 'uri'

  # extract first URI from html_string
  html_string.slice(URI.regexp)

  # remove ftp URIs
  html_string.sub(URI.regexp(['ftp']), '')

  # You should not rely on the number of parentheses
  html_string.scan(URI.regexp) do |*matches|
    p $&
  end
  # File lib/uri/common.rb, line 847
  def self.regexp(schemes = nil)
    DEFAULT_PARSER.make_regexp(schemes)
  end
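The usage above assumes an existing html_string. A self-contained sketch with a sample string is given here; scan_uris is a hypothetical helper, not part of URI, and it reads the whole match from $& rather than from the capture groups. Later Ruby versions mark URI.regexp as obsolete and may print a warning in verbose mode.

```ruby
require 'uri'

# Hypothetical helper (not part of URI): collects every whole match,
# ignoring the capture groups as the description advises.
def scan_uris(text, schemes = nil)
  found = []
  text.scan(URI.regexp(schemes)) { found << $& }
  found
end

text = "docs at http://www.ruby-lang.org/en/ and mirror at ftp://ftp.ruby-lang.org/pub/ruby/ today"

text.slice(URI.regexp)    #=> "http://www.ruby-lang.org/en/"
scan_uris(text, ["ftp"])  #=> ["ftp://ftp.ruby-lang.org/pub/ruby/"]
```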
Returns a Hash of the defined schemes
  # File lib/uri/common.rb, line 659
  def self.scheme_list
    @@schemes
  end
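The returned Hash maps upper-case scheme names to classes, so it can be used to look up scheme-specific constants. A small sketch follows; default_port_for is a hypothetical helper, not part of URI.

```ruby
require 'uri'

# Hypothetical helper (not part of URI): looks up the class registered
# for a scheme and returns its DEFAULT_PORT, or nil for unknown schemes.
def default_port_for(scheme)
  klass = URI.scheme_list[scheme.upcase]
  klass && klass::DEFAULT_PORT
end

default_port_for("https")         #=> 443
default_port_for("ftp")           #=> 21
default_port_for("nosuchscheme")  #=> nil
```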
Synopsis
  URI::split(uri)
Args
- uri: String with URI.
Description
Splits the string into the following parts and returns them as an array:
- Scheme
- Userinfo
- Host
- Port
- Registry
- Path
- Opaque
- Query
- Fragment
Usage
  require 'uri'

  p URI.split("http://www.ruby-lang.org/")
  # => ["http", nil, "www.ruby-lang.org", nil, nil, "/", nil, nil, nil]
  # File lib/uri/common.rb, line 711
  def self.split(uri)
    DEFAULT_PARSER.split(uri)
  end
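Because the parts come back in a fixed order, they can be paired with names for easier access. The following is a sketch of that idea; uri_parts is a hypothetical helper, not part of URI, and the symbol names simply mirror the list in the description.

```ruby
require 'uri'

# Hypothetical helper (not part of URI): pairs each element returned by
# URI.split with the name of the part it holds.
def uri_parts(str)
  names = %i[scheme userinfo host port registry path opaque query fragment]
  names.zip(URI.split(str)).to_h
end

parts = uri_parts("http://www.ruby-lang.org/en/?lang=en#top")
parts[:scheme]    #=> "http"
parts[:host]      #=> "www.ruby-lang.org"
parts[:path]      #=> "/en/"
parts[:query]     #=> "lang=en"
parts[:fragment]  #=> "top"
```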