Bug 365267 - NullPointerException when handling url http://a_b.com
Status: CLOSED FIXED
Alias: None
Product: Jetty
Classification: RT
Component: client
Version: unspecified
Hardware: PC Linux
Importance: P3 critical
Target Milestone: 7.5.x
Assignee: Jan Bartel CLA
Reported: 2011-12-01 01:31 EST by Junwei Sun CLA
Modified: 2011-12-04 01:51 EST
Description Junwei Sun CLA 2011-12-01 01:31:07 EST
Build Identifier: 20110301-1815

[01 Dec 2011 13:49:19] [main] [Crawler.java:793] [INFO] add one task
[01 Dec 2011 13:49:19] [main] [Crawler.java:131] [INFO] encoded URL: http://a_b.com
[01 Dec 2011 13:49:19] [main] [HttpExchange.java:597] [DEBUG] URI = http://a_b.com
Exception in thread "main" java.lang.NullPointerException
	at org.eclipse.jetty.client.Address.<init>(Address.java:46)
	at org.eclipse.jetty.client.HttpExchange.setURI(HttpExchange.java:605)
	at org.eclipse.jetty.client.HttpExchange.setURL(HttpExchange.java:414)
	at cn.vobile.colander.Crawler.CrawlExchange.<init>(CrawlExchange.java:75)
	at cn.vobile.colander.Crawler.Crawler.addCrawlRequest(Crawler.java:160)
	at cn.vobile.colander.Crawler.Crawler.addCrawlRequest(Crawler.java:178)
	at cn.vobile.colander.Crawler.Crawler.main(Crawler.java:794)


Reproducible: Always

Steps to Reproduce:
1. Create an HttpClient.
2. Create a ContentExchange.
3. Call setURL("http://a_b.com/").
Comment 1 Jan Bartel CLA 2011-12-01 18:05:50 EST
Junwei,

A bad host name such as "a_b" will now throw an IllegalArgumentException instead of an NPE. Fixed for 7.6.0.

Jan
Comment 2 Junwei Sun CLA 2011-12-01 20:51:26 EST
Hi Jan,

Actually, URLs containing "_" work in browsers, so I think we should support such URLs rather than just throwing an exception. For example, I have this URL:
http://basic_sounds.blogspot.com/

(In reply to comment #1)
> Junwei,
> 
> A bad host name such as "a_b" will now throw an IllegalArgumentException
> instead of a NPE. Fixed for 7.6.0.
> 
> Jan
Comment 3 Junwei Sun CLA 2011-12-01 20:54:11 EST
it should be supported.
Comment 4 Jan Bartel CLA 2011-12-01 22:44:57 EST
Hi Junwei,

Well, this is an interesting situation with the Java URI and URL classes.

If you do:

URI uri = URI.create("http://basic_sounds.blogspot.com");
uri.getHost();

you get null.

On the other hand, if you do:
URI uri = URI.create("http://basic_sounds.blogspot.com");
URL url = uri.toURL();
url.getHost();

you get "basic_sounds.blogspot.com"
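
The difference can be checked with a small standalone program. This is only a sketch using java.net classes; the class name UnderscoreHostDemo and its helper methods are made up for illustration and are not part of Jetty.

```java
import java.net.MalformedURLException;
import java.net.URI;

public class UnderscoreHostDemo {
    // Host as seen by java.net.URI. For an authority containing "_",
    // URI does not treat it as a server-based authority, so this is null.
    static String hostFromUri(String spec) {
        return URI.create(spec).getHost();
    }

    // Host as seen by java.net.URL, which does not validate the characters
    // of the host name the way URI does.
    static String hostFromUrl(String spec) {
        try {
            return URI.create(spec).toURL().getHost();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + spec, e);
        }
    }

    public static void main(String[] args) {
        String spec = "http://basic_sounds.blogspot.com";
        System.out.println("URI.getHost():      " + hostFromUri(spec));
        // The raw authority is still available from the URI even though getHost() is null.
        System.out.println("URI.getAuthority(): " + URI.create(spec).getAuthority());
        System.out.println("URL.getHost():      " + hostFromUrl(spec));
    }
}
```
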

If you read about valid hostnames, "_" is not a valid character:
http://en.wikipedia.org/wiki/Hostname

So, I don't think we can second-guess the URI class and try and accommodate invalid chars in hostnames. Sun (now Oracle) were pretty clear about what is and is not considered a valid name here:

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5049974

Now, what you can do is use a different Jetty API to work around the invalid hostname. This will work:

HttpExchange.setRequestURI("http://basic_sounds.blogspot.com");
HttpExchange.setAddress(new Address("basic_sounds.blogspot.com", 80));
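
For a crawler that handles many such URLs, the host and port passed to Address could be derived per URL rather than hard-coded. The following is only a sketch under that assumption: HostPortFromUrl and its methods are hypothetical helpers built on java.net.URL, and the Jetty calls appear only in a comment, mirroring the two lines above.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

public class HostPortFromUrl {
    // Parse via URL so that a host containing "_" is still returned by getHost().
    private static URL parse(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Not a usable URL: " + spec, e);
        }
    }

    static String host(String spec) {
        return parse(spec).getHost();
    }

    // Use the explicit port if one is given, otherwise the scheme default (80 for http).
    static int port(String spec) {
        URL url = parse(spec);
        return url.getPort() != -1 ? url.getPort() : url.getDefaultPort();
    }

    public static void main(String[] args) {
        String spec = "http://basic_sounds.blogspot.com";
        System.out.println(host(spec) + ":" + port(spec));
        // With a Jetty 7 exchange instance (not shown here), the values would be used as:
        //   exchange.setRequestURI(spec);
        //   exchange.setAddress(new Address(host(spec), port(spec)));
    }
}
```
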

regards
Jan

(In reply to comment #3)
> it should be supported.
Comment 5 Junwei Sun CLA 2011-12-04 01:51:53 EST
OK, I see. Thank you very much.