Copyright statement: This is an original article by the blogger and may not be reproduced without the blogger's permission.
Compiling original material takes effort, so please credit the source when reprinting: Java study notes on sending GET and POST requests with Apache Commons HttpClient
Code download address: http://www.zuidaima.com/share/1754065983409152.htm
HttpClient is something I have wanted to study recently. Some applications I had thought about before were never implemented very well, and discovering this open source project gave me some leads. In particular, there is now a way to solve the headache of handling cookies. I collected some well-written material from the Internet and put it here.
HTTP is probably the most widely used and most important protocol on the Internet today, and more and more Java applications need to access network resources directly over HTTP. Although the JDK's java.net package already provides basic HTTP access, for most applications the functionality of the JDK library is not rich or flexible enough. HttpClient is a sub-project of Apache Jakarta Commons that provides an efficient, up-to-date, feature-rich client-side programming toolkit for HTTP, and it supports the latest versions and recommendations of the protocol. HttpClient has been used in many projects; for example, Cactus (from Apache Jakarta) and HTMLUnit, two other well-known open source projects, both use it. For more applications that use HttpClient, see http://wiki.apache.org/jakarta-httpclient/HttpClientPowered. The HttpClient project is very active and has many users. The current version of HttpClient is 3.0 RC4, released on October 11, 2005.
------------------------------------
Using HttpClient to deal with all kinds of stubborn web servers
Reprinted from: http://blog.csdn.net/ambitiontan/archive/2006/01/06/572171.aspx
Normally we use a browser such as IE or Navigator to access a web server, to browse pages, view information, or submit data. Some of these pages are ordinary pages, some can be used only after the user logs in or authenticates, and some are transmitted over an encrypted channel such as HTTPS. Today's browsers handle all of these situations without trouble. Sometimes, however, you need a program to access such pages, for example to "steal" data from someone else's pages, or to use pages provided by some site to accomplish a task. Suppose we want to know where a mobile phone number is registered but do not have that data ourselves; we then have to rely on another company's existing website: submit the phone number to its page and parse the data we want out of the returned page. If the other side offers only a very simple page, our program is also very simple and there is no need to waste words on it here. Because of service authorization concerns, however, many companies provide pages that cannot be reached through a simple URL; you must register and log in before you can use the pages that provide the service, and this is where cookies come in. Popular dynamic web technologies such as ASP and JSP track session information through cookies. For our program to use such service pages, it must first log in and then visit the service page, handling the cookies itself along the way. Imagine how dreadful it would be to do all this with java.net.HttpURLConnection! And that is just one common kind of "stubbornness" among what we call stubborn web servers. What about uploading files over HTTP? There is no need for a headache: with "it", these problems are easily solved!
We cannot list every possible kind of stubbornness, so we will deal with the most common problems. As mentioned above, solving them ourselves with java.net.HttpURLConnection would be dreadful, so before we start, let us introduce an open source project: httpclient, from the Apache open source organization, which belongs to the Jakarta Commons project. The current version is 2.0 RC2. Commons already has a net sub-project, yet httpclient was split out on its own, which shows that accessing an HTTP server is no easy matter.
The Commons-httpclient project is designed specifically to simplify communication programming between an HTTP client and a server. It makes previously troublesome tasks easy: for example, you no longer need to care whether the communication is HTTP or HTTPS; tell it you want HTTPS and httpclient does the rest for you. This article shows how to use httpclient to solve several problems we often run into when writing HTTP client programs. To help readers get familiar with the project quickly, we start with a simple example that reads the content of a web page and then solve the remaining problems step by step.
1. Read the content of the Web page (HTTP/HTTPS)
Here is a simple example of how to access a page:
- /*
-  * Created on 2003-12-14 by Liudong
-  */
- package http.demo;
- import java.io.IOException;
- import org.apache.commons.httpclient.*;
- import org.apache.commons.httpclient.methods.*;
- /**
-  * The simplest HTTP client, demonstrating how to access a page by GET or POST
-  * @author Liudong
-  */
- public class SimpleClient {
-     public static void main(String[] args) throws IOException {
-         HttpClient client = new HttpClient();
-         // Set the proxy server address and port
-         //client.getHostConfiguration().setProxy("proxy_host_addr",proxy_port);
-         // Use the GET method; if the server must be reached over HTTPS, just replace http in the URL below with https
-         HttpMethod method = new GetMethod("http://java.sun.com");
-         // Use the POST method instead
-         //HttpMethod method = new PostMethod("http://java.sun.com");
-         client.executeMethod(method);
-         // Print the status line returned by the server
-         System.out.println(method.getStatusLine());
-         // Print the returned content
-         System.out.println(method.getResponseBodyAsString());
-         // Release the connection
-         method.releaseConnection();
-     }
- }
In this example we first create an HTTP client instance (HttpClient), then choose whether to submit with GET or POST, and then execute that method on the HttpClient instance. Finally, we read the server's response from the method object. This is the basic procedure for using HttpClient. In fact, a single line of code is enough to carry out the entire request, which is very simple!
2. Submit parameters to web pages by GET or POST
The previous example already showed how to request a page with GET or POST. This section differs in that it also sets the parameters to submit with the request. With a GET request, all parameters are appended directly to the page URL: a question mark separates them from the page address, and the parameters are separated from one another by &, for example http://java.sun.com/?name=liudong&mobile=123456. With POST it is slightly more work. The example in this section demonstrates how to look up the city in which a mobile phone number is registered. The code is as follows:
- /*
-  * Created on 2003-12-7 by Liudong
-  */
- package com.zuidaima.http.demo;
- import java.io.IOException;
- import org.apache.commons.httpclient.*;
- import org.apache.commons.httpclient.methods.*;
- /**
-  * Demonstrates submitting parameters.
-  * The program connects to a page that looks up where a mobile phone number is registered
-  * and queries the province and city of the number segment 1330227.
-  * @author Liudong
-  */
- public class SimpleHttpClient {
-     public static void main(String[] args) throws IOException {
-         HttpClient client = new HttpClient();
-         client.getHostConfiguration().setHost("www.imobile.com.cn", 80, "http");
-         // Submit the data using POST (call getGetMethod() instead to use GET)
-         HttpMethod method = getPostMethod();
-         client.executeMethod(method);
-         // Print the status line returned by the server
-         System.out.println(method.getStatusLine());
-         // Re-decode the result page as in the original code
-         String response = new String(method.getResponseBodyAsString().getBytes("8859_1"));
-         // Print the returned content
-         System.out.println(response);
-         method.releaseConnection();
-     }
-     /**
-      * Submit data using GET
-      * @return a GET method with the parameter appended to the URL
-      */
-     private static HttpMethod getGetMethod() {
-         return new GetMethod("/simcard.php?simcard=1330227");
-     }
-     /**
-      * Submit data using POST
-      * @return a POST method with the parameter set in the request body
-      */
-     private static HttpMethod getPostMethod() {
-         PostMethod post = new PostMethod("/simcard.php");
-         NameValuePair simcard = new NameValuePair("simcard", "1330227");
-         post.setRequestBody(new NameValuePair[] { simcard });
-         return post;
-     }
- }
In the example above, the page http://www.imobile.com.cn/simcard.php takes one parameter, simcard, whose value is the first seven digits of a mobile phone number. The server returns a page with the province, city, and other details for the submitted number. With GET, the parameter information is simply appended to the URL; with POST, the parameter names and their values are set through the NameValuePair class.
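To make the difference between the two submission styles concrete, here is a minimal sketch that uses only the JDK. The class FormEncodingSketch and its methods are hypothetical helpers written for illustration, not part of HttpClient; the sketch assumes UTF-8 encoding and only approximates what the library does with NameValuePair values.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not an HttpClient class): encodes name/value pairs
// the way they travel in a GET query string or a POST form body.
class FormEncodingSketch {

    // Encodes one name or value as application/x-www-form-urlencoded text.
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    // Joins the pairs as name1=value1&name2=value2.
    static String toFormString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("simcard", "1330227");
        String encoded = toFormString(params);
        // GET: the encoded pairs follow the path after a question mark.
        System.out.println("/simcard.php?" + encoded);
        // POST: the same string is sent as the request body instead.
        System.out.println(encoded);
    }
}
```

In both cases the encoded text is the same; only its position in the request changes.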
3. Processing page redirection
In JSP/Servlet programming, the response.sendRedirect method uses the redirection mechanism of the HTTP protocol. It differs from <jsp:forward ...> in that the latter performs the page jump inside the server: the application container loads the content of the target page and returns it to the client. The former instead returns a status code, whose possible values are listed in the table below, and the client then reads the URL of the target page and loads the new page. Because of this process, our program needs to call HttpMethod.getStatusCode() and check whether the returned value is one of those in the table to decide whether a jump is required. If it is, the new address can be obtained from the location header of the HTTP response.
Status code | Constant in HttpServletResponse | Description
301 | SC_MOVED_PERMANENTLY | The page has been permanently moved to a new address
302 | SC_MOVED_TEMPORARILY | The page has been temporarily moved to a new address
303 | SC_SEE_OTHER | The address requested by the client must be accessed through another URL
307 | SC_TEMPORARY_REDIRECT | Same as SC_MOVED_TEMPORARILY
The following code snippet demonstrates how to handle page redirection:
- client.executeMethod(post);
- System.out.println(post.getStatusLine().toString());
- post.releaseConnection();
- // Check for redirection
- int statuscode = post.getStatusCode();
- if ((statuscode == HttpStatus.SC_MOVED_TEMPORARILY)
-         || (statuscode == HttpStatus.SC_MOVED_PERMANENTLY)
-         || (statuscode == HttpStatus.SC_SEE_OTHER)
-         || (statuscode == HttpStatus.SC_TEMPORARY_REDIRECT)) {
-     // Read the new URL from the location response header
-     Header header = post.getResponseHeader("location");
-     if (header != null) {
-         String newuri = header.getValue();
-         if ((newuri == null) || (newuri.equals(""))) {
-             newuri = "/";
-         }
-         GetMethod redirect = new GetMethod(newuri);
-         client.executeMethod(redirect);
-         System.out.println("Redirect:" + redirect.getStatusLine().toString());
-         redirect.releaseConnection();
-     } else {
-         System.out.println("Invalid redirect");
-     }
- }
To test the example above, we can write two JSP pages ourselves, one of which redirects to the other using the response.sendRedirect method.
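The redirect decision itself can be tried out without a server. The following is a minimal JDK-only sketch; RedirectSketch and its methods are hypothetical names, and resolving a relative location value against the requested URL is an assumption added here for illustration (the snippet above passes the header value straight to GetMethod).

```java
import java.net.URI;

// Hypothetical helper (not an HttpClient class): decides whether a status
// code is a redirect and works out the address to request next.
class RedirectSketch {

    // True for the four status codes listed in the table above.
    static boolean isRedirect(int status) {
        return status == 301 || status == 302 || status == 303 || status == 307;
    }

    // Resolves a location header value against the URL that was requested.
    // An empty or missing value falls back to "/", as in the snippet above.
    static String resolveLocation(String requestUrl, String location) {
        String target = (location == null || location.isEmpty()) ? "/" : location;
        return URI.create(requestUrl).resolve(target).toString();
    }

    public static void main(String[] args) {
        String requested = "http://localhost:8080/app/login.jsp";
        if (isRedirect(302)) {
            System.out.println(resolveLocation(requested, "main.jsp"));
        }
    }
}
```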
4. Simulate the input of username and password for login
This is probably the most common problem in HTTP client programming. Much of the content on many websites is visible only to registered users, so you must log in with a correct username and password before you can browse to the desired page. Because HTTP is stateless, that is, a connection is valid only for the current request and is closed once the content has been delivered, the cookie mechanism must be used to keep the user's login information. Take JSP/Servlet as an example. When a browser requests a JSP or Servlet page, the application server returns a cookie named jsessionid (the name varies between application servers) whose value is a long unique string; this cookie is the session identifier for the current visit to the site. The browser must send the jsessionid cookie along whenever it visits other pages of the site, and the application server obtains the corresponding session information by reading this identifier.
For websites that require login, the user's data is usually stored in the server-side session after a successful login. When another page is accessed, the application server uses the cookie sent by the browser to find the session identifier of the current request and from it the session information. It then checks whether the user's data is present in that session: if so, access to the page is allowed; otherwise the user is sent to the login page and asked for an account name and password. This is the usual way of handling user logins in websites developed with JSP.
So an HTTP client that wants to access a protected page has to imitate what the browser does: first request the login page and read the cookie value; then request the login page again, this time with all the parameters the login requires; and finally request the target page. Every request except the first must carry the cookie information so that the server can tell whether the current request has been authenticated. That is a lot to say, but with httpclient you do not need to add a single extra line of code: just submit the login information to perform the login, then visit the desired page directly, exactly as you would an ordinary page, because the HttpClient class already does everything that needs doing for you. Great! The following example implements such an access process.
- /*
-  * Created on 2003-12-7 by Liudong
-  */
- package com.zuidaima.http.demo;
- import org.apache.commons.httpclient.*;
- import org.apache.commons.httpclient.cookie.*;
- import org.apache.commons.httpclient.methods.*;
- /**
-  * An example that demonstrates logging in through a form
-  * @author Liudong
-  */
- public class FormLoginDemo {
-     static final String LOGON_SITE = "localhost";
-     static final int LOGON_PORT = 8080;
-     public static void main(String[] args) throws Exception {
-         HttpClient client = new HttpClient();
-         client.getHostConfiguration().setHost(LOGON_SITE, LOGON_PORT);
-         // Simulate the login page: login.jsp -> main.jsp
-         PostMethod post = new PostMethod("/main.jsp");
-         NameValuePair name = new NameValuePair("name", "ld");
-         NameValuePair pass = new NameValuePair("password", "ld");
-         post.setRequestBody(new NameValuePair[] { name, pass });
-         int status = client.executeMethod(post);
-         System.out.println(post.getResponseBodyAsString());
-         post.releaseConnection();
-         // Show the cookies the client now holds for this site
-         CookieSpec cookiespec = CookiePolicy.getDefaultSpec();
-         Cookie[] cookies = cookiespec.match(LOGON_SITE, LOGON_PORT, "/", false, client.getState().getCookies());
-         if (cookies.length == 0) {
-             System.out.println("None");
-         } else {
-             for (int i = 0; i < cookies.length; i++) {
-                 System.out.println(cookies[i].toString());
-             }
-         }
-         // Access the desired page, main2.jsp; the cookies are sent automatically
-         GetMethod get = new GetMethod("/main2.jsp");
-         client.executeMethod(get);
-         System.out.println(get.getResponseBodyAsString());
-         get.releaseConnection();
-     }
- }
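For readers curious about the bookkeeping that HttpClient takes over here, the following JDK-only sketch shows the core idea: remember the cookies from a response and send them back on the next request. CookieSketch is a hypothetical class written for illustration, and the cookie values are made up. It deliberately ignores domain, path, and expiry matching, which a real client such as HttpClient does perform.

```java
import java.net.HttpCookie;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified cookie jar (not an HttpClient class).
class CookieSketch {
    private final List<HttpCookie> jar = new ArrayList<>();

    // Stores the cookie found in one Set-Cookie response header.
    void remember(String setCookieHeader) {
        jar.addAll(HttpCookie.parse(setCookieHeader));
    }

    // Builds the value of the Cookie request header for the next request.
    String cookieHeader() {
        StringBuilder sb = new StringBuilder();
        for (HttpCookie c : jar) {
            if (sb.length() > 0) {
                sb.append("; ");
            }
            sb.append(c.getName()).append('=').append(c.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CookieSketch client = new CookieSketch();
        // Pretend the login response carried this header (made-up value).
        client.remember("Set-Cookie: JSESSIONID=A1B2C3; Path=/");
        // Every later request would carry the session identifier back.
        System.out.println("Cookie: " + client.cookieHeader());
    }
}
```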
5. Submit XML format parameters
Submitting parameters in XML format is simple; it is just a matter of setting the ContentType at submission time. The following example reads XML from a file and submits it to the server; this process can be used to test Web services.
- package com.zuidaima.httpclient;
- import java.io.File;
- import java.io.FileInputStream;
- import org.apache.commons.httpclient.HttpClient;
- import org.apache.commons.httpclient.methods.EntityEnclosingMethod;
- import org.apache.commons.httpclient.methods.PostMethod;
- /**
-  * Example of submitting data in XML format
-  */
- public class PostXMLClient {
-     public static void main(String[] args) throws Exception {
-         File input = new File("test.xml");
-         PostMethod post = new PostMethod("http://localhost:8080/httpclient/xml.jsp");
-         // Read the request body directly from the file
-         post.setRequestBody(new FileInputStream(input));
-         // Send the exact length when it fits in an int; otherwise use chunked transfer
-         if (input.length() < Integer.MAX_VALUE) {
-             post.setRequestContentLength(input.length());
-         } else {
-             post.setRequestContentLength(EntityEnclosingMethod.CONTENT_LENGTH_CHUNKED);
-         }
-         // Specify the type of the request content
-         post.setRequestHeader("Content-type", "text/xml; charset=GBK");
-         HttpClient httpclient = new HttpClient();
-         int result = httpclient.executeMethod(post);
-         System.out.println("Response status code: " + result);
-         System.out.println("Response body: ");
-         System.out.println(post.getResponseBodyAsString());
-         post.releaseConnection();
-     }
- }
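One detail worth keeping in mind is that the charset named in the Content-type header should match the bytes actually sent, and that the content length counts bytes rather than characters. The sketch below illustrates this with the JDK only; XmlBodySketch is a hypothetical helper, and UTF-8 and ISO-8859-1 are used instead of GBK purely because every JDK is guaranteed to ship them.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not an HttpClient class): pairs an XML body with a
// Content-type header that names the charset used to encode it.
class XmlBodySketch {

    // Header value announcing XML content in the given charset.
    static String contentType(Charset charset) {
        return "text/xml; charset=" + charset.name();
    }

    // The bytes that would go on the wire; their count is the content length.
    static byte[] encodeBody(String xml, Charset charset) {
        return xml.getBytes(charset);
    }

    public static void main(String[] args) {
        String xml = "<name>\u00e9</name>";
        for (Charset cs : new Charset[] { StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1 }) {
            System.out.println(contentType(cs) + " -> " + encodeBody(xml, cs).length + " bytes");
        }
    }
}
```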
6. Upload files over HTTP
httpclient uses a separate HttpMethod subclass, MultipartPostMethod, to handle file uploads. This class encapsulates the details of uploading a file; all we need to do is tell it the full path of the file to upload. The following code snippet demonstrates how to use this class:
- MultipartPostMethod filePost = new MultipartPostMethod(targetURL);
- // Pass a java.io.File so that the file content is uploaded;
- // a plain String value would be sent as an ordinary text parameter
- filePost.addParameter("fileName", new File(targetFilePath));
- HttpClient client = new HttpClient();
- // Since the file to upload may be large, set the connection timeout here
- client.getHttpConnectionManager().getParams().setConnectionTimeout(5000);
- int status = client.executeMethod(filePost);
In the above code, targetFilePath is the path of the file to be uploaded.
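To give a rough idea of the details that such a class hides, here is a JDK-only sketch of a multipart/form-data body for a single text file. MultipartSketch, the boundary string, and the file name are all hypothetical and chosen for illustration; a real implementation streams the file bytes and picks a boundary that cannot occur in the content.

```java
// Hypothetical helper (not an HttpClient class): lays out the text of a
// multipart/form-data request body with one file part.
class MultipartSketch {
    private static final String CRLF = "\r\n";

    // Header value that tells the server which boundary separates the parts.
    static String contentType(String boundary) {
        return "multipart/form-data; boundary=" + boundary;
    }

    // Builds a body with a single file part followed by the closing boundary.
    static String buildBody(String boundary, String fieldName, String fileName, String fileContent) {
        return "--" + boundary + CRLF
                + "Content-Disposition: form-data; name=\"" + fieldName
                + "\"; filename=\"" + fileName + "\"" + CRLF
                + "Content-Type: application/octet-stream" + CRLF
                + CRLF
                + fileContent + CRLF
                + "--" + boundary + "--" + CRLF;
    }

    public static void main(String[] args) {
        String boundary = "----sketchBoundary123";
        System.out.println("Content-Type: " + contentType(boundary));
        System.out.println();
        System.out.print(buildBody(boundary, "fileName", "notes.txt", "hello"));
    }
}
```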
7. Visit authentication-enabled pages
We often come across pages for which the browser pops up a dialog asking for a username and password. This kind of user authentication differs from the form-based authentication introduced earlier; it is the authentication mechanism of HTTP itself. httpclient supports three authentication schemes: Basic, Digest, and NTLM. Basic authentication is the simplest and most widely supported but also the least secure. Digest authentication was added in HTTP 1.1, while NTLM is defined by Microsoft rather than by a general standard; the latest version of NTLM is more secure than Digest authentication.
The following example is taken from httpclient's CVS repository; it simply demonstrates how to access a page protected by authentication:
- package com.zuidaima.httpclient;
- import org.apache.commons.httpclient.HttpClient;
- import org.apache.commons.httpclient.UsernamePasswordCredentials;
- import org.apache.commons.httpclient.methods.GetMethod;
- public class BasicAuthenticationExample {
-     public BasicAuthenticationExample() {
-     }
-     public static void main(String[] args) throws Exception {
-         HttpClient client = new HttpClient();
-         client.getState().setCredentials("www.verisign.com", "realm", new UsernamePasswordCredentials("username", "password"));
-         GetMethod get = new GetMethod("https://www.verisign.com/products/index.html");
-         get.setDoAuthentication(true);
-         int status = client.executeMethod(get);
-         System.out.println(status + "\n" + get.getResponseBodyAsString());
-         get.releaseConnection();
-     }
- }
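Why Basic authentication is considered the least secure scheme becomes clear when you look at what is actually sent. The following JDK-only sketch builds and then reverses the Authorization header value; BasicAuthSketch is a hypothetical class for illustration, and the credentials are the well-known sample pair from the HTTP authentication specification. The password is only Base64-encoded, not encrypted, so anyone who can read the request can recover it unless the connection itself is protected, for example by HTTPS.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper (not an HttpClient class): shows the header value
// used by HTTP Basic authentication.
class BasicAuthSketch {

    // Builds the Authorization header value for the given credentials.
    static String authorizationValue(String user, String password) {
        byte[] pair = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(pair);
    }

    // Recovers "user:password" from a header value, showing that the
    // encoding offers no secrecy.
    static String decode(String headerValue) {
        String encoded = headerValue.substring("Basic ".length());
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String header = authorizationValue("Aladdin", "open sesame");
        System.out.println("Authorization: " + header);
        System.out.println("Decoded: " + decode(header));
    }
}
```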
8. Using httpclient in multithreaded mode
Multiple threads may use httpclient at the same time, for example to download several files from one site simultaneously. A given HttpConnection can be used by only one thread at a time. To avoid conflicts in a multithreaded environment, httpclient provides a multithreaded connection manager class, MultiThreadedHttpConnectionManager. Using it is simple: just pass it in when constructing the HttpClient instance. The code is as follows:
MultiThreadedHttpConnectionManager connectionManager = new MultiThreadedHttpConnectionManager();
HttpClient client = new HttpClient(connectionManager);
From then on, simply use the client instance as usual.
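The rule that a connection serves only one thread at a time can be pictured with a small pool. The sketch below uses only the JDK and is purely conceptual: ConnectionPoolSketch is a hypothetical class, the "connections" are plain strings, and nothing here reflects how MultiThreadedHttpConnectionManager is actually implemented.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical pool: each connection is lent to exactly one thread at a time.
class ConnectionPoolSketch {
    private final BlockingQueue<String> idle;
    private final AtomicInteger inUse = new AtomicInteger();
    private final AtomicInteger maxInUse = new AtomicInteger();

    ConnectionPoolSketch(int size) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add("connection-" + i);
        }
    }

    // Blocks until a connection is free, then hands it to the caller.
    String acquire() throws InterruptedException {
        String connection = idle.take();
        maxInUse.accumulateAndGet(inUse.incrementAndGet(), Math::max);
        return connection;
    }

    // Returns a connection so that another thread may borrow it.
    void release(String connection) {
        inUse.decrementAndGet();
        idle.add(connection);
    }

    // Highest number of connections that were lent out at the same moment.
    int maxInUse() {
        return maxInUse.get();
    }

    // Runs the given number of tasks on a thread pool; returns how many finished.
    static int run(ConnectionPoolSketch pool, int threads, int tasks) {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            executor.submit(() -> {
                try {
                    String connection = pool.acquire();
                    try {
                        done.incrementAndGet(); // a request would be sent here
                    } finally {
                        pool.release(connection);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return done.get();
    }

    public static void main(String[] args) {
        ConnectionPoolSketch pool = new ConnectionPoolSketch(2);
        int finished = run(pool, 8, 100);
        System.out.println(finished + " tasks finished, at most " + pool.maxInUse() + " connections in use at once");
    }
}
```

However many threads compete, the number of connections in use at once never exceeds the pool size, which is the kind of guarantee a connection manager has to provide.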
Reference material:
httpclient home page: http://jakarta.apache.org/commons/httpclient/
How NTLM works: http://davenport.sourceforge.net/ntlm.html