This Bugzilla instance is deprecated, and most Eclipse projects now use GitHub or Eclipse GitLab. Please see the deprecation plan for details.
Bug 476910 - [html][lint] Add HTML Validation (was Investigate using htmllint)
Summary: [html][lint] Add HTML Validation (was Investigate using htmllint)
Status: RESOLVED FIXED
Alias: None
Product: Orion (Archived)
Classification: ECD
Component: JS Tools (show other bugs)
Version: 10.0   Edit
Hardware: All All
: P3 enhancement (vote)
Target Milestone: 13.0   Edit
Assignee: Curtis Windatt CLA
QA Contact:
URL:
Whiteboard:
Keywords:
: 489463 (view as bug list)
Depends on: 472412
Blocks: 502487
  Show dependency tree
 
Reported: 2015-09-08 14:42 EDT by Michael Rennie CLA
Modified: 2016-09-28 15:11 EDT (History)
5 users (show)

See Also:


Attachments
Simple template for html validator (8.83 KB, patch)
2016-07-11 16:14 EDT, Curtis Windatt CLA
no flags Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Michael Rennie CLA 2015-09-08 14:42:28 EDT
Since we currently have no linting / parsing errors being reported for HTML files, we should investigate using htmllint, which looks like a port of CSSLint and is backed by the htmlparser2 parser.

https://github.com/htmllint

We are already requesting to use htmlparser2 in bug 472412
Comment 1 Carolyn MacLeod CLA 2015-09-09 12:09:42 EDT
Please use the following options for accessibility:
img-req-alt
label-req-for

(thanks! these are exactly the 2 options I was going to code by hand! both of these are accessibility errors, and deserve a red x <g>)

Also, this one looks useful for accessibility (probably should be an error, too):
fig-req-figcaption

Consider using:
id-no-dup (this just seems like a good idea - I've seen some id dups on our rendered pages)

Do we want to enforce these?
html-req-lang
doctype-html5
Comment 2 Curtis Windatt CLA 2015-09-09 12:19:40 EDT
(In reply to Carolyn MacLeod from comment #1)
> Please use the following options for accessibility:
> img-req-alt
> label-req-for
> 
> (thanks! these are exactly the 2 options I was going to code by hand! both
> of these are accessibility errors, and deserve a red x <g>)
> 
> Also, this one looks useful for accessibility (probably should be an error,
> too):
> fig-req-figcaption
> 
> Consider using:
> id-no-dup (this just seems like a good idea - I've seen some id dups on our
> rendered pages)
> 
> Do we want to enforce these?
> html-req-lang
> doctype-html5

FYI, we would have to go through the IP process for approval to use htmllint.  If the experience is similar to CSSLint/ESLint and we may not be able to use the existing rule implementations.  I will open the CQ today.
Comment 3 Curtis Windatt CLA 2015-09-09 17:03:20 EDT
https://dev.eclipse.org/ipzilla/show_bug.cgi?id=10138
I opened a CQ to start the IP Process
Comment 4 Curtis Windatt CLA 2015-10-28 16:27:55 EDT
(In reply to Curtis Windatt from comment #3)
> https://dev.eclipse.org/ipzilla/show_bug.cgi?id=10138
> I opened a CQ to start the IP Process

The CQ has been approved.
Comment 5 Carolyn MacLeod CLA 2015-10-30 18:20:19 EDT
Great! Can we use the options listed in comment 1?
Comment 6 Curtis Windatt CLA 2015-11-02 09:44:51 EST
(In reply to Carolyn MacLeod from comment #5)
> Great! Can we use the options listed in comment 1?

There is a big chunk of work to hook this into the Orion validation framework.  However, the CQ included the rules so we can use their implementations.  We will have to look at how to package the rules as we found having each rule in its own file significantly slows down load times.  I don't see a built script file that includes everything (like csslint had).
Comment 7 Curtis Windatt CLA 2016-03-14 10:17:13 EDT
*** Bug 489463 has been marked as a duplicate of this bug. ***
Comment 8 Carolyn MacLeod CLA 2016-03-14 12:49:24 EDT
Just in case it doesn't work out with htmllint, the html validator mentioned in bug 489463 works pretty well from what I can see, and it's from the W3C: https://validator.w3.org/nu/about.html
(just FYI).
Comment 9 Curtis Windatt CLA 2016-07-06 16:04:52 EDT
There are number of obstacles preventing use of HTMLLint.  Here are the two major ones:

1) It is all built on node require, including the rules.  Browserifying the whole library results in a 886k file.

2) The following dependencies are used in HTMLLint, which the CQ likely did not consider:
lodash
bulk-require
htmlparser2
promise

Based on this I'm working on just using the rules files.  We already are using htmlparser2 to create a DOM for content assist and the editor context provides access to the text and line/col data.  It doesn't look like the rules have dependencies on the additional libraries, though some of the framework around them does.
Comment 10 Carolyn MacLeod CLA 2016-07-07 11:07:13 EDT
Does the w3c html validator (https://validator.w3.org/nu/about.html) have a bunch of similar dependencies/obstacles? I'm not sure what to look for.
Comment 11 Curtis Windatt CLA 2016-07-07 15:47:05 EDT
(In reply to Carolyn MacLeod from comment #10)
> Does the w3c html validator (https://validator.w3.org/nu/about.html) have a
> bunch of similar dependencies/obstacles? I'm not sure what to look for.

The Nu validator is a relatively large (4 MB src) Java program.  No way it is going to run as part of Orion client.  Depending on what happens with this language server API being discussed perhaps the validation could be done on some separate server (similar to how the Flux project provides Java tooling to a browser).
Comment 12 Curtis Windatt CLA 2016-07-07 17:13:54 EDT
(In reply to Curtis Windatt from comment #9)
> It doesn't look like the
> rules have dependencies on the additional libraries, though some of the
> framework around them does.

Some of the rules use knife.js to do some text filtering.  Knife has requires on lodash and bulk-require.  So to use the rules as is we would need to get legal approval.

Using browserify to get the rules into a single AMD file resulted in a 680KB file (the unmodified rule source is 29K).
Comment 13 Curtis Windatt CLA 2016-07-11 16:14:56 EDT
Created attachment 263028 [details]
Simple template for html validator
Comment 14 Steve Northover CLA 2016-07-14 17:54:26 EDT
Curtis, if we are going to implement this, let's set the target.
Comment 15 Curtis Windatt CLA 2016-07-18 11:20:39 EDT
(In reply to Steve Northover from comment #14)
> Curtis, if we are going to implement this, let's set the target.

Investigation is definitely happening for 13.  However, the investigation is leading towards NOT using htmllint.
Comment 16 Carolyn MacLeod CLA 2016-07-18 12:18:01 EDT
How hard would it be to collect up a bunch of rules and write our own?
(at a minimum, the ones in comment 1)

Then we could:
- keep it small
- not need any dependencies
Comment 17 Michael Rennie CLA 2016-07-18 12:40:19 EDT
(In reply to Carolyn MacLeod from comment #16)
> How hard would it be to collect up a bunch of rules and write our own?
> (at a minimum, the ones in comment 1)
> 
> Then we could:
> - keep it small
> - not need any dependencies

I think this would be the best option (in light of htmllint not being that great). We already have a visitor for HTML ASTs, so we just need to hook up some rules and configuration options.
Comment 18 Curtis Windatt CLA 2016-07-20 10:49:15 EDT
http://git.eclipse.org/c/orion/org.eclipse.orion.client.git/commit/?id=aa8351108fd131509e021f3a609dd41f0cd2c6fe
Fixes some bugs around casing, severities and ranges
Adds img-alt-req and fig-req-figcaption rules
Comment 19 Curtis Windatt CLA 2016-07-20 10:50:02 EDT
http://git.eclipse.org/c/orion/org.eclipse.orion.client.git/commit/?id=d2a877a1b59a2bc6a17da030c7671e3167b5cf56
Initial validator implementation with banned-attr rule
Comment 20 Curtis Windatt CLA 2016-09-28 14:53:48 EDT
This should have been closed after the previous commits.  It is FIXED.  We will add more rules incrementally.