Reverse proxy URL remap problem (page not found)

From Wikillano

Problems with links in applications served through a reverse proxy.

The problem is that some links point directly to / instead of to the keyword (path prefix) we use in the reverse proxy, which results in a page-not-found error.

Solution

Example: main domain on the proxy machine.

NameVirtualHost *
<VirtualHost *>
       ServerAdmin dballano@dominio.com
       DocumentRoot /data/dungeons/www
       ServerName prueba.dominio.es
       ErrorLog /var/log/apache2/error.log
       CustomLog /var/log/apache2/access.log combined

##################################
##   REVERSE PROXY app MANTIS  ##
##################################
ProxyPass /aplicacion/ http://dominio.isclab.es/
ProxyPassReverse /aplicacion/ http://dominio.isclab.es/

ProxyHTMLURLMap http://dominio.isclab.es /aplicacion

<Location /aplicacion/>
       ProxyPassReverse /
       SetOutputFilter  proxy-html
       ProxyHTMLURLMap  /      /aplicacion/
       ProxyHTMLURLMap  /aplicacion  /aplicacion
#       RequestHeader    unset  Accept-Encoding
</Location>

</VirtualHost>


In this example, if we enter http://prueba.dominio.es/aplicacion/ in the browser, the proxy forwards the request to http://dominio.isclab.es/ (the proxied machine).

Example: proxied machine.

<VirtualHost *>
ServerAdmin davidb@dominio.es
DocumentRoot /data/www/aplicacion/
ServerName dominio.isclab.es

</VirtualHost>

When the request for http://dominio.isclab.es/ arrives, the machine responds by serving /data/www/aplicacion correctly. If a page contains a link that changes the URL to /loquesea/... instead of /aplicacion/loquesea, the link is corrected by the following directives:

ProxyHTMLURLMap http://dominio.isclab.es /aplicacion

<Location /aplicacion/>
       ProxyPassReverse /
       SetOutputFilter  proxy-html
       ProxyHTMLURLMap  /      /aplicacion/
       ProxyHTMLURLMap  /aplicacion  /aplicacion
#       RequestHeader    unset  Accept-Encoding
</Location>
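
The commented-out RequestHeader line in the Location block is worth a note. The proxy-html filter has to parse the HTML as text, so a response that the backend has compressed (gzip, for example) cannot be rewritten. A minimal sketch, assuming mod_headers is loaded and that the backend would otherwise compress its responses, enables that line so the backend is never asked for compressed content:

```apache
<Location /aplicacion/>
       ProxyPassReverse /
       SetOutputFilter  proxy-html
       ProxyHTMLURLMap  /      /aplicacion/
       ProxyHTMLURLMap  /aplicacion  /aplicacion
       # Assumption: mod_headers is available. Removing Accept-Encoding from the
       # proxied request makes the backend answer uncompressed, so the
       # proxy-html filter can parse and rewrite the HTML.
       RequestHeader    unset  Accept-Encoding
</Location>
```

The trade-off is that traffic between the proxy and the backend is no longer compressed.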

To use this module we need to install the following Apache module:

apt-get install libapache2-mod-proxy-html

Source:

Fixing HTML Links

As we have seen, ProxyPassReverse remaps URLs in the HTTP headers to ensure they work from outside the company network. There is, however, a separate problem when links appear in HTML pages served. Consider the following cases:

  <a href="somefile.html">This link will be resolved by the browser and will work correctly.</a>
  <a href="/otherfile.html">This link will be resolved by the browser to http://www.example.com/otherfile.html, which is incorrect.</a>
  <a href="http://internal1.example.com/">This link will resolve to "no such host" for the browser.</a>

The same problem of course applies to included content such as images, stylesheets, scripts or applets, and other contexts where URLs occur in HTML.

To fix this requires us to parse the HTML and rewrite the links. This is the purpose of mod_proxy_html. It works as an output filter, parsing the HTML and rewriting links as it is served. Two configuration directives are required to set it up:

  • SetOutputFilter proxy-html: This simply inserts the filter, to enable ProxyHTMLURLMap.
  • ProxyHTMLURLMap from-pattern to-pattern [flags]: In its basic form, this has a similar purpose and semantics to ProxyPassReverse. Additionally, an extended form is available to enable search-and-replace rewriting of URLs within Scripts and Stylesheets.

How it works

mod_proxy_html is based on a SAX parser: specifically the HTMLparser module from libxml2 running in SAX mode (any other parse mode would of course be very much slower, especially for larger documents). It has full knowledge of all URI attributes that can occur in HTML 4 and XHTML 1. Whenever a URL is encountered, it is matched against applicable ProxyHTMLURLMap directives. If it starts with any from-pattern, that will be rewritten to the to-pattern. Rules are applied in the reverse order to their appearance in httpd.conf, and matching stops as soon as a match is found.

Here’s how we set up a reverse proxy for HTML. Firstly, full links to the internal servers should be rewritten regardless of where they arise, so we have:

ProxyHTMLURLMap http://internal1.example.com /app1
ProxyHTMLURLMap http://internal2.example.com /app2

Note that in this instance we omitted the “trailing” slash. Since the matching logic is starts-with, we use the minimal matching pattern. We have now globally fixed case 3 above.

Case 2 above requires a little more care. Because the link doesn’t include the hostname, the rewrite rule must be context-sensitive. As with ProxyPassReverse above, we deal with that using <Location>:

<Location /app1/>
       ProxyHTMLURLMap / /app1/
</Location>
<Location /app2/>
       ProxyHTMLURLMap / /app2/
</Location>
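
The ordering statement under "How it works" (rules applied in reverse order of appearance, matching stops at the first hit) is what makes the extra ProxyHTMLURLMap /aplicacion /aplicacion line in the proxy configuration useful. A sketch of the same idea for /app1, taking that ordering statement at face value (it may differ between module versions, so treat it as an assumption); the filter insertion is omitted, as in the fragment above:

```apache
<Location /app1/>
       # Declared first, so tried last: root-relative links such as
       # /otherfile.html gain the /app1/ prefix.
       ProxyHTMLURLMap / /app1/
       # Declared last, so tried first: links that already start with /app1
       # map to themselves and matching stops, so they are not rewritten a
       # second time into /app1/app1/...
       ProxyHTMLURLMap /app1 /app1
</Location>
```

Without the second rule, a link that already points to /app1/page.html would still start with / and would therefore be matched by the first rule as well.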

Links

http://www.apachetutor.org/admin/reverseproxies

Note: keep in mind that enabling this module converts all output to UTF-8, so take care, especially with applications that use ISO-8859-1.
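
One possible way to handle the encoding change is sketched below. It assumes the installed module supports the ProxyHTMLCharsetOut directive, which is documented for the mod_proxy_html shipped with Apache httpd 2.4 (where it relies on mod_xml2enc being loaded); whether the packaged version installed above provides it has not been verified here.

```apache
<Location /aplicacion/>
       SetOutputFilter      proxy-html
       ProxyHTMLURLMap      /      /aplicacion/
       # Assumption: ProxyHTMLCharsetOut is supported by the installed module.
       # "*" requests output in the same encoding as the input instead of the
       # default UTF-8; a fixed charset name such as ISO-8859-1 can be given
       # in its place.
       ProxyHTMLCharsetOut  *
</Location>
```

The httpd documentation notes that any output encoding other than UTF-8 adds processing overhead, so this is best limited to the applications that need it.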
