#3175 Escape html values
Merged 2 years ago by tkopecek. Opened 2 years ago by jcupova.
jcupova/koji issue-3155  into  master

No commits found

Performing html escaping both in index.py and in the templates seems like asking for double escapes.

I think _convert_to_int could have a better name, or at least some clearer commenting. It does not convert all inputs to ints.

I'm also not sure about the value of _escape_value vs just html.escape in most/all cases where it's used. Seems like we're always going to have strings.

rebased onto 72856209128bd9c1cab00bd59be883812378b851

2 years ago

@mikem
added clearer comment for _convert_to_int

Ha, you are right, _escape_value was also for other types of value, but in the final version we don't need it, because everything sent to _escape_value is string. So, I dropped it and used just html.escape.

And about double escapes:

html.escape in index.py escapes values for output to html. But in a few cases we need to escape it directly as input from html to index.py because:
when you have for example this craze username in users.chtml
alsjdl@%R!@#!@-wb70p5s7jd (this name is valid)
you got in userinfo function in index.py only 'alsjdl@%R!@'

Therefore we need to escape also this cases in way from html to index.py. We don't need it for all values, because most of it is IDs, but these few cases are string.

This change allows an XSS attack on the search page that was previously blocked by the VALID_SEARCH_* checks. For example, using the fakeweb script:

http://localhost:8000/search?match=regexp&type=package&terms=%22%3E%3Cscript%3Ealert(1)%3C/script%3E

In search.chtml, we're escaping terms as old_terms, but the terms var remains unchanged and is used in numerous generated urls (via $util.passthrough($self, 'order', 'terms', 'type', 'match')).

Granted, we don't want html.escape for the escaping in the url. We likely want urlencode or quote_plus. It probably would make sense to do this in passthrough and passthrough_except.

I was thinking that _convert_if_int might be a better name for that function. The name _convert_to_int seems to imply strict conversion to me.

I think we should be as consistent as possible about how and when we're doing our escaping. So far, all the escaping has been done in the templates or utility functions called from within them. I think we should maintain that. It is confusing to have escaping done in index.py for some handlers and in the templates for others. Furthermore, if we start mixing it will be all too easy to end up escaping at both layers.

rebased onto dea600587e8e4e0f5b9e074f51e4505f155a94b2

2 years ago

rebased onto 3e019278031bd1d40a21d014bafd4aa3a1ee5367

2 years ago

rebased onto 2bc7937b99bcc7b7f3b670c614d3c71ceef632c0

2 years ago

rebased onto c09cac954ee6a4c7d858e94adce4f1088e857571

2 years ago

rebased onto 80a72d7aa130b1030e9295e325be436fb3a67d03

2 years ago

rebased onto d5985146a8de64f2fe80615c5f0a9e83b61e6083

2 years ago

rebased onto 12fa9d0c8f5d540643bed9de6dfe6d2f670a8551

2 years ago

Current version still not prevent attack Mike proposed.

Metadata Update from @tkopecek:
- Pull-request tagged with: testing-ready

2 years ago

rebased onto f93daaa

2 years ago

Commit 084a19d fixes this pull-request

Pull-Request has been merged by tkopecek

2 years ago

Metadata Update from @jobrauer:
- Pull-request tagged with: testing-done

2 years ago
Metadata