Java crawler: two ways to fetch a web page's source (clean version)

Author: 岑岑 (original article)

Method 1: URL

package InternetTest;

import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class a44 {
    public static void main(String[] args) throws Exception {
        URL url = new URL("http://www.baidu.com");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5 * 1024); // connect timeout in milliseconds (about 5 seconds)
        InputStream inStream = conn.getInputStream();
        ByteArrayOutputStream outStream = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int len = 0;
        // Copy the response body into memory in 1 KB chunks until end of stream.
        while ((len = inStream.read(buffer)) != -1) {
            outStream.write(buffer, 0, len);
        }
        inStream.close();
        byte[] data = outStream.toByteArray();
        // Decode explicitly as UTF-8 (an assumption about the page) rather than the platform default.
        String htmlSource = new String(data, StandardCharsets.UTF_8);
        System.out.println(htmlSource);
    }
}

Method 2: HttpClient

package InternetTest;

import org.apache.http.HttpEntity;
import org.apache.http.HttpStatus;
import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.client.utils.HttpClientUtils;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;

public class a45 {
    public static void main(String[] args) throws Exception {
        String url1 = "http://www.baidu.com";
        CloseableHttpClient closeableHttpClient = HttpClients.createDefault();
        CloseableHttpResponse closeableHttpResponse = null;
        HttpGet request = new HttpGet(url1);
        closeableHttpResponse = closeableHttpClient.execute(request);
        if (closeableHttpResponse.getStatusLine().getStatusCode() == HttpStatus.SC_OK) {
            // 200 OK: read the response body as a UTF-8 string.
            HttpEntity httpEntity = closeableHttpResponse.getEntity();
            String html = EntityUtils.toString(httpEntity, "utf-8");
            System.out.println(html);
        } else {
            // Any other status: still print whatever body the server returned.
            System.out.println(EntityUtils.toString(closeableHttpResponse.getEntity(), "utf-8"));
        }
        // Release the response and the client without throwing on close errors.
        HttpClientUtils.closeQuietly(closeableHttpResponse);
        HttpClientUtils.closeQuietly(closeableHttpClient);
    }
}
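
Both listings decode the response bytes without consulting the charset the server declares, so a page served in another encoding may come out garbled. The following is a minimal sketch, using only the JDK, of one way to pick the charset from a Content-Type header value and then read a stream with it. The class name PageSourceDecoder and its two methods are hypothetical helpers introduced here for illustration, and falling back to UTF-8 is an assumption, not something the original code does.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper class, not part of the original listings.
class PageSourceDecoder {

    // Extracts the charset from a Content-Type value such as
    // "text/html; charset=ISO-8859-1". Falls back to UTF-8 (an assumption)
    // when the header is null, has no charset parameter, or names an
    // unknown charset.
    static Charset charsetFromContentType(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        // Illegal or unsupported charset name: use the fallback.
                        break;
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Reads the whole stream into memory, using the same 1 KB buffer loop
    // as the first method, and decodes it with the given charset.
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int len;
        while ((len = in.read(buffer)) != -1) {
            out.write(buffer, 0, len);
        }
        return new String(out.toByteArray(), charset);
    }
}
```

With the first method, the values passed to these helpers would presumably be conn.getContentType() and conn.getInputStream(). This sketch does not look at a charset declared only inside the HTML itself (for example in a meta tag), so it is a partial remedy at best.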

Original article address: http://www.guoyinggangguan.com/xedk/232787.html