Hiding True Database Object ID in URLs

Hiding true database object ID in URLs

This question has been asked many times, with different wording each time (which makes it difficult to say, "Just search for it!"). That prompted a blog post titled "The Comprehensive Guide to URL Parameter Encryption in PHP".

What People Want To Do Here

An encryption function is applied to the database ID, so that the ID can be recovered deterministically from the URL.

What People Should Do Instead

Use a separate column that stores a random identifier.

Explanation

Typically, people want short random-looking URLs. This doesn't allow you much room to encrypt then authenticate the database record ID you wish to obfuscate. Doing so would require a minimum URL length of 32 bytes (for HMAC-SHA256), which is 44 characters when encoded in base64.
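The length arithmetic can be checked with a few lines of Python (used here only for illustration). The key and record ID are placeholders, and the sketch computes only the authentication tag, not the encryption step, so a real encrypted-and-authenticated token would be longer still:

```python
import base64
import hashlib
import hmac

key = b"\x00" * 32   # placeholder key, only here to demonstrate lengths
message = b"42"      # hypothetical record ID

# HMAC-SHA256 always produces a 32-byte tag, whatever the message length.
tag = hmac.new(key, message, hashlib.sha256).digest()

# Base64 turns every 3 bytes into 4 characters and pads the last group.
encoded = base64.urlsafe_b64encode(tag).decode("ascii")

print(len(tag), len(encoded))
```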

A simpler strategy is to generate a random string (see random_compat for a PHP5 implementation of random_bytes() and random_int() for generating these strings) and reference that column instead.
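A minimal sketch of the separate-column approach, written in Python for illustration (in PHP the random bytes would come from random_bytes()). The table name, column names, and the 9-byte token length are assumptions made up for this example:

```python
import secrets
import sqlite3


def create_article(conn, title):
    """Insert a row and return the random public token stored with it."""
    token = secrets.token_urlsafe(9)  # 9 random bytes -> 12 URL-safe characters
    conn.execute(
        "INSERT INTO articles (public_id, title) VALUES (?, ?)",
        (token, title),
    )
    return token


def find_article(conn, token):
    """Look a row up by its public token, never by its primary key."""
    return conn.execute(
        "SELECT id, title FROM articles WHERE public_id = ?", (token,)
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles ("
    " id INTEGER PRIMARY KEY,"          # internal key, never shown to clients
    " public_id TEXT NOT NULL UNIQUE,"  # random token used in URLs
    " title TEXT NOT NULL)"
)
token = create_article(conn, "Hello")
row = find_article(conn, token)
```

The UNIQUE constraint means a collision raises an error instead of silently aliasing two records, so a caller can simply retry with a fresh token.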

Also, Hashids have been broken by simple cryptanalysis. The conclusion of that analysis states:

The attack I have described is significantly better than a brute force attack, so from a cryptographic stand point the algorithm is considered to be broken, it is quite easy to recover the salt; making it possible for an attacker to run the encoding in either direction and invalidates property 2 for an ideal hash function.

Don't rely on it.

url design: ways to hide pk/id from url

You need to have some kind of identifier in the URL, and this identifier:

  1. must be unique (no two objects can have the same id)
  2. must be permanent (the id for an object can never change)

so there aren't all that many options, and the object's primary key is the best choice. If for some reason you can't use that (why not?) you can encode or obfuscate it: see this question and its answers for some ideas about how to do that.

Stack Overflow's own URL design is worth a look. You can reach this question via any URL of the form

https://stackoverflow.com/questions/9897050/any-text-you-like-here!

This allows the URL to contain keywords from the question's title (for search engines) while also being able to change when the title changes without breaking old links.
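One way to get this behaviour is to resolve the record from the numeric part only and treat the trailing text as decoration. The sketch below is an illustration in Python, not Stack Overflow's actual routing code; the function names and the regular expression are assumptions:

```python
import re

# /questions/<id>/<optional slug>; the slug is captured but never used for lookup.
QUESTION_PATH = re.compile(r"^/questions/(\d+)(?:/([^/]*))?/?$")


def parse_question_path(path):
    """Return (question_id, slug), or None if the path is not a question URL."""
    match = QUESTION_PATH.match(path)
    if match is None:
        return None
    return int(match.group(1)), match.group(2) or ""


def canonical_path(question_id, current_slug):
    """Build the URL to redirect to when the slug in the request is stale."""
    return f"/questions/{question_id}/{current_slug}"
```

A handler would look the question up by ID and, if the requested slug differs from the current one, redirect to canonical_path(...) so that old links keep working.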

How to hide a database ID from HTML/Javascript

Regarding security you have several aspects:

  • Session hijacking
  • Accessing/Modifying/Creating/Deleting records the user is not authorized to touch
  • Non-Authenticated access
  • Cross-Site* attacks
  • Man-in-the-middle attacks
  • etc.

The measures to deal with these depend on your architecture and security needs.

Since you don't say much about your architecture and security needs, it is really hard to give any specific advice...

Some points regarding "ID shouldn't be guessable":

  • "Correct" solution

    The problem goes away the moment you implement authentication and authorization properly, because together they ensure that only authenticated users can access anything at all AND that every user can only access what they are allowed to. Even if an authenticated user knows the correct ID of something they are not allowed to access, the system remains secure because they are prevented from accessing it.

  • "weak solution"

    Create a ConcurrentDictionary as a thread-safe in-memory cache and store the real IDs alongside "temporary IDs" (for example, GUIDs freshly generated on first access to a record). You can combine the temporary ID with a salt, encryption, and/or a hash of connection-specific details (such as client IP or time). On every access you look the ID up in the ConcurrentDictionary and act accordingly. One positive effect: after an app restart (for example, app pool recycling) the same record gets a different ID, because this is only an in-memory cache. For the same reason, though, this is hardly usable in a web-farm scenario.
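The "weak solution" can be sketched as follows. This is a Python illustration of the idea, with a lock-protected dict standing in for .NET's ConcurrentDictionary; the class and method names are made up for the example, and the optional salt/hash step is left out:

```python
import threading
import uuid


class TemporaryIdMap:
    """Per-process, in-memory mapping between real IDs and temporary GUIDs."""

    def __init__(self):
        self._lock = threading.Lock()
        self._real_to_temp = {}
        self._temp_to_real = {}

    def temp_id_for(self, real_id):
        """Return the temporary ID for a record, creating it on first access."""
        with self._lock:
            temp = self._real_to_temp.get(real_id)
            if temp is None:
                temp = str(uuid.uuid4())
                self._real_to_temp[real_id] = temp
                self._temp_to_real[temp] = real_id
            return temp

    def real_id_for(self, temp_id):
        """Translate a temporary ID back, or return None if it is unknown."""
        with self._lock:
            return self._temp_to_real.get(temp_id)
```

Because the mapping lives only in one process's memory, a second instance (or the same one after a restart) hands out different IDs for the same record, which is exactly the web-farm limitation.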

Exposing database IDs - security risk?

There are risks associated with exposing database identifiers. On the other hand, it would be extremely burdensome to design a web application without exposing them at all. Thus, it's important to understand the risks and take care to address them.

The first danger is what OWASP called "insecure direct object references." If someone discovers the id of an entity, and your application lacks sufficient authorization controls to prevent it, they can do things that you didn't intend.

Here are some good rules to follow:

  1. Use role-based security to control access to an operation. How this is done depends on the platform and framework you've chosen, but many support a declarative security model that will automatically redirect browsers to an authentication step when an action requires some authority.
  2. Use programmatic security to control access to an object. This is harder to do at a framework level. More often, it is something you have to write into your code and is therefore more error prone. This check goes beyond role-based checking by ensuring not only that the user has authority for the operation, but also has necessary rights on the specific object being modified. In a role-based system, it's easy to check that only managers can give raises, but beyond that, you need to make sure that the employee belongs to the particular manager's department.
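The difference between the two rules can be shown with the raise example. This is a minimal Python sketch; the User and Employee types and the can_give_raise function are hypothetical names invented for illustration:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    role: str
    department: str


@dataclass(frozen=True)
class Employee:
    name: str
    department: str


def can_give_raise(actor, employee):
    """Combine the role-based check with the per-object check."""
    # Rule 1 (role-based): only managers may perform this operation at all.
    if actor.role != "manager":
        return False
    # Rule 2 (programmatic): the manager must own this particular employee.
    return actor.department == employee.department
```

With both checks in place, knowing or guessing another employee's ID gains an attacker nothing: the request is refused whichever ID is supplied.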

There are schemes to hide the real identifier from an end user (e.g., map between the real identifier and a temporary, user-specific identifier on the server), but I would argue that this is a form of security by obscurity. I want to focus on keeping real cryptographic secrets, not trying to conceal application data. In a web context, it also runs counter to widely used REST design, where identifiers commonly show up in URLs to address a resource, which is subject to access control.

Another challenge is prediction or discovery of the identifiers. The easiest way for an attacker to discover an unauthorized object is to guess it from a numbering sequence. The following guidelines can help mitigate that:


  1. Expose only unpredictable identifiers. For the sake of performance, you might use sequence numbers in foreign key relationships inside the database, but any entity you want to reference from the web application should also have an unpredictable surrogate identifier. This is the only one that should ever be exposed to the client. Using random UUIDs for these is a practical solution for assigning these surrogate keys, even though they aren't cryptographically secure.

  2. One place where cryptographically unpredictable identifiers are a necessity, however, is in session IDs or other authentication tokens, where the ID itself authenticates a request. These should be generated by a cryptographic RNG.
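As a rough illustration of the two guidelines in Python (the function names are assumptions): a random UUID serves as the exposed surrogate key, while anything that authenticates a request comes from a generator explicitly documented as cryptographically secure.

```python
import secrets
import uuid


def new_surrogate_id():
    """Unpredictable public identifier for an entity (a version 4 UUID)."""
    return str(uuid.uuid4())


def new_session_token():
    """Authentication token drawn from the operating system's CSPRNG."""
    return secrets.token_urlsafe(32)  # 32 random bytes, URL-safe text
```

The surrogate ID still has to sit behind an authorization check; only the session token is meant to be a secret in its own right.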

am I exposing sensitive data if I put a bson ID in a url?

I can't think of any way to use them to gain privileges on your machines; however, using ObjectIds everywhere discloses a lot of information nonetheless.

By crawling your website, one could:

  • find out about some hidden objects: for instance, if the counter part goes from 0x....b1 to 0x....b9 between times t1 and t2, one can guess ObjectIds within these intervals. However, guessing ids is most likely useless if you enforce access permissions
  • know the signup date of each user (not very sensitive info but better than nothing)
  • deduce actual (as opposed to publicly available) business hours from the timestamps of objects created by the staff
  • deduce in which timezones your audience lives from the timestamps of user-generated objects: if your website is one which people use mostly at lunchtime, then one could measure peaks of ObjectIds and deduce that a peak at 8 PM UTC means the audience was on the US West coast
  • and more generally, by crawling most of your website, one can build a timeline of the success of your service, having for any given time knowledge of: your user count, levels of user engagement, how many servers you've got, how often your servers are restarted. PID changes occurring on weekends are more likely crashes, whereas those on business days are more likely crashes + software revisions
  • and probably find other info specific to your business processes and domain
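To see how little effort this takes, here is a small Python sketch that pulls the creation time and counter out of an ObjectId string. The function name is made up for the example, and the sample ID is just a commonly seen placeholder value. The first 4 bytes are a big-endian Unix timestamp and the last 3 bytes are the counter; the 5 bytes in between are machine and process identifiers in the older layout and a random value in newer drivers, so they are left alone here:

```python
from datetime import datetime, timezone


def decode_object_id(hex_id):
    """Return (creation time in UTC, counter) embedded in an ObjectId."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is 12 bytes (24 hex characters)")
    seconds = int.from_bytes(raw[:4], "big")   # bytes 0-3: Unix timestamp
    counter = int.from_bytes(raw[9:], "big")   # bytes 9-11: incrementing counter
    return datetime.fromtimestamp(seconds, tz=timezone.utc), counter


created, counter = decode_object_id("507f1f77bcf86cd799439011")
print(created.isoformat(), counter)
```

Running this over every ID collected by a crawler yields the timeline described above.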

To be fair, even with random ids one can infer a lot. The main issue is that you need to prevent anyone from scraping a statistically significant part of your site. But if someone is determined, they'll succeed eventually, which is why providing them with all of this extra, timestamped info seems wrong.

symfony: permalinks (hide id in urls)

In Doctrine you have an extension called "Sluggable" you can use.

To make it work you have to change your schema.yml and add the "Sluggable" extension:

# config/doctrine/schema.yml
Article:
  actAs:
    Timestampable: ~
    Sluggable:
      fields: [name]
  columns:
    name:
      type: string(255)
      notnull: true

Set up a DoctrineRoute in your routing.yml

# apps/frontend/config/routing.yml
category:
  url: /article/:slug
  class: sfDoctrineRoute
  param: { module: article, action: show }
  options: { model: Article, type: object }

Then in your code for the action you can do something like this:

public function executeShow(sfWebRequest $request)
{
    $article = $this->getRoute()->getObject(); // Fetch the Article matching the :slug parameter
    $this->forward404Unless($article); // Display 404 if no article matches slug
    $this->article = $article; // Pass the object to the template
}

Don't forget to run a doctrine:build to recreate the database after you alter your schema.
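For readers unfamiliar with slugs, the idea is to derive a URL-friendly string from a field such as name. The Python function below is only a rough illustration of that transformation, not Doctrine's actual algorithm:

```python
import re
import unicodedata


def slugify(text):
    """Lowercase the text, strip accents, and join the words with hyphens."""
    # Decompose accented letters, then drop the non-ASCII combining marks.
    ascii_text = (
        unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
```

A real implementation also has to deal with two names that produce the same slug, since the slug is what the route uses to find the record.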

php security should I hide query string in url

It's only a security concern if this is sensitive information. For example, you send a user to this URL:

/park.php?park_id=1

Now the user knows that the park currently being viewed has a system identifier of "1" in the database. What happens if the user then manually requests this?:

/park.php?park_id=2

Have they compromised your security? If they're not allowed to view park ID 2 then this request should fail appropriately. But is it a problem if they happen to know that there's an ID of 1 or 2?

In either case, all the user is doing is making a request. The server-side code is responsible for appropriately handling that request. If the user is not permitted to view that data, deny the request. Don't try to stop the user from making the request, because they can always find a way. (They can just type it in manually, even without ever having visited your site in the first place.) The security takes place in responding to the request, not in making it.

There is some data they're not allowed to know. But an ID probably isn't that data. (Or at least shouldn't be, because numeric IDs are very easy to guess.)
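In other words, the decision is made when the server answers. A minimal Python sketch of such a handler for the park example follows; the PARK_VIEWERS table and the function name are hypothetical, and real code would read permissions from the database:

```python
# Hypothetical permission data: park ID -> users allowed to view it.
PARK_VIEWERS = {1: {"alice", "bob"}, 2: {"bob"}}


def handle_park_request(user, park_id):
    """Return an HTTP-style status for a request to view the given park."""
    allowed = PARK_VIEWERS.get(park_id)
    if allowed is None:
        return 404  # no such park
    if user not in allowed:
        return 403  # the park exists, but this user may not see it
    return 200
```

It makes no difference whether park_id arrived through a link on the site or was typed into the address bar; the response depends only on who is asking and what they asked for.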


