Java HttpClient + Jsoup: Building a Forum "Flooding" (Auto-Reply) Tool, No More Fear of Fire

Keywords: xml Windows Google Java

A long time ago I started writing a small auto-reply program but never finished it. Recently I had some free time and picked it up again. I knew very little about the HTTP protocol, so I could only consult the almighty Google. After I explained my idea, Google declared me a good lad who was contributing to the cause of fire prevention, and granted me the following two tools:

1. HttpClient 4.3.1 (GA)

The main functions provided by HttpClient are listed below. For more details, see HttpClient's home page.

  • Implements all HTTP methods (GET, POST, PUT, HEAD, etc.)
  • Follows redirects automatically
  • Supports HTTPS
  • Supports proxy servers, etc.

2. Jsoup

The main functions of jsoup are as follows:

  • Parse HTML from a URL, file, or string
  • Find and extract data using DOM traversal or CSS selectors
  • Manipulate HTML elements, attributes, and text
  • Selector syntax similar to jQuery's

Without further ado: the HttpClient source package includes an examples folder with some basic usage samples, and these are enough to get started. ClientFormLogin.java is the one to look for. Its comments clearly explain how to simulate an HTTP login request and store the cookies.

Test website: http://bbs.dakele.com/

Because this site handles login in its own way, it may differ from a standard DZ (Discuz!) forum. Please adapt the code yourself.

I analyzed the site with Chrome's built-in Inspect Element tool, which took quite a lot of time.

Login address: http://passport.dakele.com/login.do?product=bbs

Entering a wrong username and password reveals that the actual login address is http://passport.dakele.com/logon.do. Note the difference between "login" and "logon" (i versus o): I did not notice it at first and thought I was seeing ghosts.

The error response:

{"err_msg":"Account or password error"}

Entering the correct credentials returns a response that includes a redirect link.

Opening that redirect link directly in the browser completes a normal login.

Getting the redirect link:

private LoginResult getRedirectUrl(){
        LoginResult loginResult = null;
        CloseableHttpClient httpClient = HttpClients.createDefault();
        HttpPost httpost = new HttpPost(LOGINURL);
        httpost.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
        httpost.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
        httpost.setHeader("Cache-Control", "max-age=0");
        httpost.setHeader("Connection", "keep-alive");
        httpost.setHeader("Host", "passport.dakele.com");
        httpost.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
        List <NameValuePair> nvps = new ArrayList <NameValuePair>();
        nvps.add(new BasicNameValuePair("product", "bbs"));
        nvps.add(new BasicNameValuePair("surl", "http://bbs.dakele.com/"));
        nvps.add(new BasicNameValuePair("username", "yourname"));//User name
        nvps.add(new BasicNameValuePair("password", "yourpass"));//Password
        nvps.add(new BasicNameValuePair("remember", "0"));

        httpost.setEntity(new UrlEncodedFormEntity(nvps, Consts.UTF_8));
        CloseableHttpResponse response2 = null;
        try {
            response2 = httpClient.execute(httpost);
            if(response2.getStatusLine().getStatusCode()==200){
                HttpEntity entity = response2.getEntity();
                String entityString = EntityUtils.toString(entity);
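                // The response is a single JSON object; wrap it in [] so json-lib can map it as a one-element array
                JSONArray jsonArray = JSONArray.fromObject("["+entityString+"]");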
                JsonConfig jsonConfig=new JsonConfig();
                jsonConfig.setArrayMode(JsonConfig.MODE_OBJECT_ARRAY);
                jsonConfig.setRootClass(LoginResult.class);
                LoginResult[] results= (LoginResult[]) JSONSerializer.toJava( jsonArray, jsonConfig );
                if(results.length==1){
                    loginResult = results[0];
                }
            }
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }finally{
            try {
                // response2 is still null if execute() threw, so guard before closing
                if(response2!=null){
                    response2.close();
                }
                httpClient.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        return loginResult;
    }

Login code:

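public boolean login(){
        boolean flag = false;
        LoginResult loginResult = getRedirectUrl();
        // getRedirectUrl() returns null when the request fails, so check before dereferencing
        if(loginResult!=null && "true".equals(loginResult.getResult())){
            cookieStore = new BasicCookieStore();
            globalClient = HttpClients.custom().setDefaultCookieStore(cookieStore).build();
            HttpGet httpGet = new HttpGet(loginResult.getRedirect());
            httpGet.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
            httpGet.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
            httpGet.setHeader("Connection", "keep-alive");
            httpGet.setHeader("Host", HOST);
            httpGet.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
            CloseableHttpResponse response = null;
            try {
                // Requesting the redirect link is what makes the server set the session cookies
                response = globalClient.execute(httpGet);
                // Consume the body so the connection can be reused by later requests
                EntityUtils.consume(response.getEntity());
            } catch (ClientProtocolException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                if(response!=null){
                    try { response.close(); } catch (IOException e) { e.printStackTrace(); }
                }
            }
            List<Cookie> cookies2 = cookieStore.getCookies();
            if (cookies2.isEmpty()) {
                log.error("cookie is empty");
            } else {
                // Assumption: stored cookies mean the login worked (the original never set flag)
                flag = true;
                for (Cookie cookie : cookies2) {
                    log.debug("cookie: "+cookie.getName());
                }
            }
        }

        return flag;
    }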

Now that you are logged in, you can do what only a logged-in account can do. What, you don't know what that is? Flooding the forum with replies, of course.

First, get the addresses of the posts to reply to. The list pages follow a regular URL pattern. Automatic discovery of all posts is not implemented yet, so the list pages are simply walked in a hard-coded loop. @1

for(int i=1;i<200;i++){
            String basurl="http://bbs.dakele.com/forum-43-"+i+".html";
            log.info(basurl);
            List<String> urls = dakele.getThreadURLs(basurl);
            for(String url:urls){
                //log.info(url);
                ReplayContent content = dakele.preReplay(url);
                if(content!=null){
                    log.info(content.getUrl());
                    log.info(content.getMessage());
                    //dakele.replay( content);
                    //Thread.sleep(15300);
                }
            }
        }

Get the post address on the list page:

String html = EntityUtils.toString(entity);
            Document document = Jsoup.parse(html,HOST);
            Elements elements=document.select("tbody[id^=normalthread_] > tr > td.new > a.xst");
            for(int i=0;i<elements.size();i++){
                Element e = elements.get(i);
                urList.add(e.attr("abs:href"));
            }

Open each post that needs a reply, get the address of the form to submit, and build the reply content:

public ReplayContent preReplay(String url){
        ReplayContent content = null;
        HttpGet get  = new HttpGet(url);
        get.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
        get.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
        get.setHeader("Connection", "keep-alive");
        get.setHeader("Host", HOST);
        get.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
        try {
            CloseableHttpResponse response = globalClient.execute(get);
            HttpEntity entity = response.getEntity();
            String html = EntityUtils.toString(entity);
            Document document = Jsoup.parse(html, HOST);
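            Element postForm = document.getElementById("fastpostform");
            // postForm is null when the page has no quick-reply form (for example, when not logged in)
            if(postForm!=null && !postForm.toString().contains("You have no right to post now.")){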
                content = new ReplayContent();
                content.setUrl(url);
                
                log.debug(postForm.attr("abs:action"));
                content.setAction(postForm.attr("abs:action"));
                
                
                ////////
                Elements teElements = document.select("td[id^=postmessage_]");
                String message = "";
                for(int i=0;i<teElements.size();i++){
                    String temp = teElements.get(i).html().replaceAll( "(?is)<.*?>", "");
                    if(temp.contains("Published in")){
                        String[] me = temp.split("\\s+");
                        temp = me[me.length-1];
                    }
                    message+=temp.replaceAll("\\s+", "");
                }
                log.debug(message.replaceAll("\\s+", ""));
                ///////////////
                /*Take the last comment.
                Element messageElement= document.select("td[id^=postmessage_]").last();
//                String message = messageElement.html().replaceAll("\\&[a-zA-Z]{1,10};", "").replaceAll("<[^>]*>", "").replaceAll("[(/>)<]", "");
                String message = messageElement.html().replaceAll( "(?is)<.*?>", "");
                */
                if(message.contains("Published in")){
                    String[] me = message.split("\\s+");
                    message = me[me.length-1];
                }
                content.setMessage(message.replaceAll("&nbsp;", "").replaceAll("upload", "").replaceAll("Enclosure", "").replaceAll("download", ""));
                Elements inputs = postForm.getElementsByTag("input");
                for(Element input:inputs){
                    log.debug(input.attr("name")+":"+input.attr("value"));
                    if(input.attr("name").equals("posttime")){
                        content.setPosttime(input.attr("value"));
                    }else if(input.attr("name").equals("formhash")){
                        content.setFormhash(input.attr("value"));
                    }else if(input.attr("name").equals("usesig")){
                        content.setUsesig(input.attr("value"));
                    }else if(input.attr("name").equals("subject")){
                        content.setSubject(input.attr("value"));
                    }
                }
            }else{
                log.warn("You have no right to post now.:"+url);
            }
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return content;
    }
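
To make the text cleanup inside preReplay easier to follow, here is a minimal standalone sketch of the same regex steps using only the JDK. The class name PostTextCleaner, its method names, and the sample strings are made up for illustration and are not part of the original project.

```java
import java.util.regex.Pattern;

// Hypothetical helper: mirrors the regex cleanup used in preReplay, nothing more.
final class PostTextCleaner {
    // (?is) = case-insensitive + DOTALL, so a tag that spans several lines still matches.
    private static final Pattern TAG = Pattern.compile("(?is)<.*?>");
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    /** Strips tags and the &nbsp; entity, then removes all whitespace. */
    static String clean(String html) {
        String noTags = TAG.matcher(html).replaceAll("");
        String noEntities = noTags.replace("&nbsp;", "");
        return WHITESPACE.matcher(noEntities).replaceAll("");
    }

    /** Returns the last whitespace-separated token, like the "Published in" branch. */
    static String lastToken(String text) {
        String[] parts = text.trim().split("\\s+");
        return parts[parts.length - 1];
    }

    public static void main(String[] args) {
        // Sample input is invented; real post HTML will look different.
        System.out.println(clean("<div>Nice\n phone,<br/>&nbsp;thanks <b>a lot</b></div>"));
        System.out.println(lastToken("Published in 2013-12-01 great"));
    }
}
```

Note that removing all whitespace suits Chinese text, where words are not separated by spaces. For English text it glues the words together, so a single space would be a better replacement there.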

We have the address and we have the content, so now the flooding can begin.

public void replay(ReplayContent content){
        
        HttpPost httpost = new HttpPost(content.getAction());
        httpost.setHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
        httpost.setHeader("Accept-Language", "zh-CN,zh;q=0.8");
        httpost.setHeader("Cache-Control", "max-age=0");
        httpost.setHeader("Connection", "keep-alive");
        httpost.setHeader("Host", HOST);
        httpost.setHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/30.0.1599.101 Safari/537.36");
        List <NameValuePair> nvps = new ArrayList <NameValuePair>();
        nvps.add(new BasicNameValuePair("posttime", content.getPosttime()));
        nvps.add(new BasicNameValuePair("formhash", content.getFormhash()));
        nvps.add(new BasicNameValuePair("usesig", content.getUsesig()));
        nvps.add(new BasicNameValuePair("subject", content.getSubject()));
        nvps.add(new BasicNameValuePair("message", content.getMessage()));

        httpost.setEntity(new UrlEncodedFormEntity(nvps, Consts.UTF_8));
        //The response must be consumed so that the connection is released for the next request. I did not notice this at first.
        CloseableHttpResponse response2 = null;
       
        try {
            response2 = globalClient.execute(httpost);
            //log.info(content.getAction());
            //log.info(content.getMessage());
            HttpEntity entity = response2.getEntity();
            EntityUtils.consume(entity);
//            BufferedWriter bw= new BufferedWriter(new FileWriter("d:/tt1.html"));
//            bw.write(EntityUtils.toString(response2.getEntity()));
//            bw.flush();
//            bw.close();
            //System.out.println(EntityUtils.toString(response2.getEntity()));
        } catch (ClientProtocolException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        
    }
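
For readers curious about what the form entity in replay sends, the sketch below builds a comparable application/x-www-form-urlencoded body with only the JDK. FormBodySketch and the sample field values are hypothetical, and HttpClient's UrlEncodedFormEntity may differ in small escaping details, so treat this as an illustration of the format rather than a drop-in replacement.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: shows the shape of a URL-encoded form body.
final class FormBodySketch {
    /** Joins name=value pairs with '&', percent-encoding both sides as UTF-8. */
    static String encode(Map<String, String> fields) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(enc(field.getKey())).append('=').append(enc(field.getValue()));
        }
        return body.toString();
    }

    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to exist on every JVM, so this should never happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("formhash", "abc123");            // placeholder, not a real hash
        fields.put("message", "\u4f60\u597d world"); // Chinese text plus a space
        System.out.println(encode(fields));
    }
}
```

Encoding as UTF-8 matters here because the reply text is Chinese. If the charset does not match what the forum expects, the posted message will show up garbled.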

Of course, this only works for forums without verification codes (CAPTCHAs). Forums that use them can only be avoided.

Flooding has its price. After one round of bombardment, this was the result.

For the reply content, at first I took only the last comment in the current post, and those replies earned me a warning! I then switched to extracting keywords with the IK word segmenter. That code is posted separately, so please head over there.

Reference links:

Shortcomings: no multithreading, and not fully tested.

The code is still being tidied up and will be provided as soon as possible.

Future plans: add check-in and task completion, and change the @1 loop to automatic discovery.

This is my first post, so it has its shortcomings. Criticism and corrections are welcome.

------------------------------------------

Download address: http://pan.baidu.com/s/1jGjwA5g

I tidied up the code this morning and am sharing it with you now. It is packaged as a MyEclipse project, so after unzipping it can be imported directly.

Change the username and password in IKFenci.java and it can be run directly.

Reposted from: https://my.oschina.net/chbing/blog/198870

Posted by justbane on Wed, 12 Jun 2019 15:47:18 -0700