Database Design For a Survey

I think your model #2 is fine; however, you can take a look at a more complex model that stores questions and pre-made (offered) answers and allows them to be reused in different surveys.


- One survey can have many questions; one question can be (re)used in many surveys.

- One (pre-made) answer can be offered for many questions. One question can have many answers offered. A question can have different answers offered in different surveys. An answer can be offered to different questions in different surveys. There is a default "Other" answer; if a person chooses "Other", their answer is recorded in Answer.OtherText.

- One person can participate in many surveys; one person can answer a specific question in a survey only once.
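The three rules above can be sketched as tables. This is only a minimal sketch, not the original diagram: it uses Python's built-in sqlite3 as a stand-in database, and every table and column name below (survey_question, survey_question_answer, other_text, and so on) is an assumption chosen for illustration.

```python
import sqlite3

# Hypothetical names throughout; SQLite is used only so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE survey         (survey_id   INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE question       (question_id INTEGER PRIMARY KEY, text  TEXT NOT NULL);
CREATE TABLE offered_answer (answer_id   INTEGER PRIMARY KEY, text  TEXT NOT NULL);
CREATE TABLE person         (person_id   INTEGER PRIMARY KEY, name  TEXT NOT NULL);

-- A question can be reused in many surveys.
CREATE TABLE survey_question (
    survey_id   INTEGER NOT NULL REFERENCES survey(survey_id),
    question_id INTEGER NOT NULL REFERENCES question(question_id),
    PRIMARY KEY (survey_id, question_id)
);

-- The answers offered for a question can differ from survey to survey.
CREATE TABLE survey_question_answer (
    survey_id   INTEGER NOT NULL,
    question_id INTEGER NOT NULL,
    answer_id   INTEGER NOT NULL REFERENCES offered_answer(answer_id),
    PRIMARY KEY (survey_id, question_id, answer_id),
    FOREIGN KEY (survey_id, question_id)
        REFERENCES survey_question(survey_id, question_id)
);

-- The key (person, survey, question) allows one answer per question per survey.
CREATE TABLE answer (
    person_id   INTEGER NOT NULL REFERENCES person(person_id),
    survey_id   INTEGER NOT NULL,
    question_id INTEGER NOT NULL,
    answer_id   INTEGER NOT NULL,
    other_text  TEXT,  -- filled in only when the "Other" answer is chosen
    PRIMARY KEY (person_id, survey_id, question_id),
    FOREIGN KEY (survey_id, question_id, answer_id)
        REFERENCES survey_question_answer(survey_id, question_id, answer_id)
);

INSERT INTO survey         VALUES (1, 'Health survey');
INSERT INTO question       VALUES (1, 'Do you smoke?');
INSERT INTO offered_answer VALUES (1, 'Other'), (2, 'Yes');
INSERT INTO person         VALUES (1, 'Alice');
INSERT INTO survey_question        VALUES (1, 1);
INSERT INTO survey_question_answer VALUES (1, 1, 1), (1, 1, 2);
""")

# Person 1 picks "Other" and supplies free text.
conn.execute("INSERT INTO answer VALUES (1, 1, 1, 1, 'Sometimes')")

# A second answer to the same question in the same survey violates the key.
try:
    conn.execute("INSERT INTO answer VALUES (1, 1, 1, 2, NULL)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

The composite foreign key on answer also means a person can only pick an answer that was actually offered for that question in that survey.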


[Diagram: survey_model_02]

How should I design a questionnaire database?

In a relational schema, you would represent question dependency through a self-joining foreign key; you don't need to go back to the junction table, since the relationship between the two questions is independent from each question's relationship to the questionnaire.
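A self-joining foreign key might look roughly like this. It is a minimal sketch using Python's sqlite3 module; the table layout and the depends_on_question_id column name are assumptions made for illustration, not part of the original question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE question (
    question_id INTEGER PRIMARY KEY,
    text        TEXT NOT NULL,
    -- Hypothetical column: the question this one depends on, if any.
    depends_on_question_id INTEGER REFERENCES question(question_id)
);
INSERT INTO question VALUES (1, 'Do you smoke?', NULL);
INSERT INTO question VALUES (2, 'How much?', 1);
""")

# Self-join: pair each dependent question with the question it depends on.
dependencies = conn.execute("""
    SELECT child.text, parent.text
    FROM question AS child
    JOIN question AS parent
      ON parent.question_id = child.depends_on_question_id
""").fetchall()
```

The dependency lives entirely in the question table, so the junction table that links questions to questionnaires is not involved.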

However, as you've probably noticed, representing a branching set of questions in a relational schema is more than a little awkward. If the questionnaire is most of what you're storing, you might want to look into alternatives like MongoDB (or using JSONB fields in Postgres), which would let you represent a questionnaire as a document containing nested questions. A single user's responses to a questionnaire comprise a second (type of) document; all collected questionnaire results are easily searchable by user or by questionnaire, and you can slice and dig into individual question responses using the aggregation tools. The only thing it doesn't make easier is question reuse -- but odds are that's less than critical.

Treating questionnaires and questions as documents also gives you some much more flexible tools for representing dependency: instead of a hard link between "do you smoke?" and "how much?", you might apply a validator to a conditions object, if present, and only ask the question if any conditions are met:

[..., {
    "name": "smoker",
    "text": "Do you smoke?",
    "values": [true, false]
}, {
    "name": "how-much",
    "text": "How much do you smoke?",
    "conditions": {
        "smoker": true
    },
    "values": ["Socially", "< 1 pack/day", "1 pack/day", "2 packs/day", ...]
}, ...]
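One way such a conditions check could work is sketched below in Python. The should_ask helper is hypothetical, and it assumes each condition maps an earlier question's name to the answer that triggers the follow-up; the elided parts of the document above are left out rather than guessed.

```python
def should_ask(question, answers):
    """Return True if the question has no conditions or any condition is met."""
    conditions = question.get("conditions")
    if not conditions:
        return True
    return any(answers.get(name) == expected
               for name, expected in conditions.items())


# Only the two questions shown above; the rest of the questionnaire is omitted.
questions = [
    {"name": "smoker", "text": "Do you smoke?", "values": [True, False]},
    {"name": "how-much", "text": "How much do you smoke?",
     "conditions": {"smoker": True},
     "values": ["Socially", "< 1 pack/day", "1 pack/day", "2 packs/day"]},
]

asked_for_smoker = [q["name"] for q in questions
                    if should_ask(q, {"smoker": True})]
asked_for_non_smoker = [q["name"] for q in questions
                        if should_ask(q, {"smoker": False})]
```

With these inputs a smoker is asked both questions, while a non-smoker is asked only the first.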

What is the best way to store survey responses in MySQL?

Normalization is the best approach for database design.

Your second option is good, but it can be improved further. Maintain two tables: questions and responses.

questions table schema

- survey_id
- question_id
- question

responses table schema

- response_id
- survey_id (references survey_id in the questions table)
- question_id (references question_id in the questions table)
- response

Let's say Survey 1 has 20 questions. The questions table will then have 20 entries, and for each question there will be answers in the responses table.

One more hint: if you expect a huge number of replies or a large data set, avoid auto-increment primary keys, as they can be exhausted. For this reason I have kept two references, survey_id and question_id, in the responses table.
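The two tables could be sketched as follows. This uses Python's sqlite3 module purely as a runnable stand-in for MySQL, so the column types are assumptions; the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE questions (
    survey_id   INTEGER NOT NULL,
    question_id INTEGER NOT NULL,
    question    TEXT NOT NULL,
    PRIMARY KEY (survey_id, question_id)
);
CREATE TABLE responses (
    response_id INTEGER PRIMARY KEY,
    survey_id   INTEGER NOT NULL,
    question_id INTEGER NOT NULL,
    response    TEXT,
    FOREIGN KEY (survey_id, question_id)
        REFERENCES questions(survey_id, question_id)
);
""")

# Survey 1 with 20 questions gives 20 rows in the questions table.
conn.executemany(
    "INSERT INTO questions VALUES (1, ?, ?)",
    [(n, f"Question {n}") for n in range(1, 21)],
)

# Two respondents answer question 1 of survey 1.
conn.executemany(
    "INSERT INTO responses (survey_id, question_id, response) VALUES (1, 1, ?)",
    [("yes",), ("no",)],
)

question_count = conn.execute(
    "SELECT COUNT(*) FROM questions WHERE survey_id = 1"
).fetchone()[0]
responses_to_q1 = [row[0] for row in conn.execute(
    "SELECT response FROM responses "
    "WHERE survey_id = 1 AND question_id = 1 ORDER BY response_id"
)]
```

Because responses carries both survey_id and question_id, all answers to one question can be fetched without going through response_id.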

Hope this helps!

Schema Design for a questionnaire with single or multiple choices with no right / wrong answer

My model would probably look something like this... Rich

--user table
user_id (primary key)
first_name
last_name
email
create_date_time
update_date_time

--survey master table, so you can reuse the model for additional surveys
survey_id (primary key)
survey_name
survey_description
create_date_time
update_date_time

--question table
question_id (primary key)
survey_id (foreign key)
question_text
description
question_type
create_date_time
update_date_time

--question answers table
answer_id (primary key)
question_id (foreign key)
answer_type
answer_text
answer_image
create_date_time
update_date_time

--user answers table
user_id (primary key and foreign key)
question_id (primary key and foreign key)
answer_id (primary key if >1 answer allowed, and foreign key)
user_answer_text
create_date_time
update_date_time
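The part of this model that is easiest to get wrong is the composite key on the user answers table, so here is a minimal sketch of it using Python's sqlite3 module. The table names (users, question_answer, user_answer), the 'multiple' question type, and the sample data are assumptions; the timestamp and description columns are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users  (user_id   INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE survey (survey_id INTEGER PRIMARY KEY, survey_name TEXT);
CREATE TABLE question (
    question_id   INTEGER PRIMARY KEY,
    survey_id     INTEGER NOT NULL REFERENCES survey(survey_id),
    question_text TEXT NOT NULL,
    question_type TEXT NOT NULL
);
CREATE TABLE question_answer (
    answer_id   INTEGER PRIMARY KEY,
    question_id INTEGER NOT NULL REFERENCES question(question_id),
    answer_text TEXT NOT NULL
);
-- answer_id is part of the key so that more than one answer is allowed.
CREATE TABLE user_answer (
    user_id          INTEGER NOT NULL REFERENCES users(user_id),
    question_id      INTEGER NOT NULL REFERENCES question(question_id),
    answer_id        INTEGER NOT NULL REFERENCES question_answer(answer_id),
    user_answer_text TEXT,
    PRIMARY KEY (user_id, question_id, answer_id)
);

INSERT INTO users    VALUES (1, 'rich@example.com');
INSERT INTO survey   VALUES (1, 'Colour survey');
INSERT INTO question VALUES (1, 1, 'Which colours do you like?', 'multiple');
INSERT INTO question_answer VALUES (1, 1, 'Red'), (2, 1, 'Blue'), (3, 1, 'Green');
-- The user ticks two of the three options.
INSERT INTO user_answer VALUES (1, 1, 1, NULL), (1, 1, 2, NULL);
""")

chosen = [row[0] for row in conn.execute("""
    SELECT qa.answer_text
    FROM user_answer AS ua
    JOIN question_answer AS qa ON qa.answer_id = ua.answer_id
    WHERE ua.user_id = 1 AND ua.question_id = 1
    ORDER BY qa.answer_id
""")]

# Ticking the same option twice is rejected by the composite key.
try:
    conn.execute("INSERT INTO user_answer VALUES (1, 1, 1, NULL)")
    repeat_rejected = False
except sqlite3.IntegrityError:
    repeat_rejected = True
```

For single-choice questions the key would shrink to (user_id, question_id), which is what the "primary key if >1 answer allowed" note is getting at.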

Simple survey database design

I would probably model the event of a user taking a survey, perhaps a table called "User_Answer_Session", which has links to the survey and the user; and then "User_Answers", which are tied to the session and the question and include the actual blob of the answer. How exactly I modeled the answers would depend on a few things (mainly how robustly I wanted to be able to look them up). For instance, do I want to be able to index multiple-choice answers for extremely rapid reporting? If so, then you need to model for that. This may include creating a "Question_Options" table, which is a one-to-many between a question and the available options...
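As a rough sketch of that idea, using Python's sqlite3 module: the column names, the index, and the sample data below are all assumptions, and a real design would also carry the survey, user and question tables that are referenced here only by id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE question_options (
    option_id   INTEGER PRIMARY KEY,
    question_id INTEGER NOT NULL,
    label       TEXT NOT NULL
);
CREATE TABLE user_answer_session (
    session_id INTEGER PRIMARY KEY,
    survey_id  INTEGER NOT NULL,
    user_id    INTEGER NOT NULL
);
CREATE TABLE user_answers (
    session_id  INTEGER NOT NULL REFERENCES user_answer_session(session_id),
    question_id INTEGER NOT NULL,
    option_id   INTEGER REFERENCES question_options(option_id),  -- multiple choice
    answer_blob TEXT,                                            -- free-form answer
    PRIMARY KEY (session_id, question_id)
);
-- Index aimed at reporting on multiple-choice answers.
CREATE INDEX idx_user_answers_option ON user_answers (question_id, option_id);

INSERT INTO question_options VALUES (1, 1, 'Yes'), (2, 1, 'No');
INSERT INTO user_answer_session VALUES (1, 1, 10), (2, 1, 11), (3, 1, 12);
INSERT INTO user_answers VALUES (1, 1, 1, NULL), (2, 1, 1, NULL), (3, 1, 2, NULL);
""")

# Count how many sessions chose each option of question 1.
report = conn.execute("""
    SELECT o.label, COUNT(a.session_id)
    FROM question_options AS o
    LEFT JOIN user_answers AS a ON a.option_id = o.option_id
    WHERE o.question_id = 1
    GROUP BY o.option_id, o.label
    ORDER BY o.option_id
""").fetchall()
```

Storing the chosen option as an id rather than inside the blob is what makes this kind of count query cheap to index.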

This should get you thinking along a good path. :-)

What database schema to use for storing survey answers

This looks like a case of premature optimization. You should probably worry more about correctness and flexibility than performance.

30 million rows per year, especially in these skinny tables, is a small amount of data for any Oracle system. Don't worry too much about indexes and partitioning yet; those can be added later if necessary.

Your solution is similar to the Entity Attribute Value (EAV) model. It's worth knowing that term since much has been written about it. There are 2 common problems with EAV models you want to avoid:

  1. Avoid extremes. Don't use EAV for everything, but don't completely avoid it either. EAV is slow and inconvenient compared to a normal table structure. It should not be used for every interesting column; otherwise you have created a database within a database. For example, if virtually every survey has fields like a username and a date created, store those as regular columns and not in a generic column. It's OK to have a column that is only populated 99% of the time. On the other hand, it's a bad idea to always avoid EAV and try to hack something together with 1,000-column tables or object-relational types.

  2. Always use the correct type. Always, always, always store data as the correct type. Store numbers as numbers, dates as dates, and strings as strings. Your queries will be easier, faster, and safer if you have at least three columns for the data: ANSWER_NUMBER, ANSWER_STRING, ANSWER_DATE. I explain the type safety problem more in this answer. Those extra columns may look bad in the model diagram, but they are a life-saver when you're querying the data.
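The three typed columns could be sketched like this. The example runs on Python's sqlite3 module rather than Oracle, so it is only an approximation: SQLite has no native DATE type (the date column is ISO-8601 text here), and the survey_answer table name and the CHECK constraint are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE survey_answer (
    answer_id     INTEGER PRIMARY KEY,
    question_id   INTEGER NOT NULL,
    answer_number REAL,
    answer_string TEXT,
    answer_date   TEXT,  -- ISO-8601 text, since SQLite has no native DATE type
    -- Exactly one of the typed columns must be populated.
    CHECK ((answer_number IS NOT NULL)
         + (answer_string IS NOT NULL)
         + (answer_date   IS NOT NULL) = 1)
);
""")

conn.executemany(
    "INSERT INTO survey_answer (question_id, answer_number) VALUES (1, ?)",
    [(9,), (10,)],
)

# Stored as numbers, 10 is the largest answer.
largest_as_number = conn.execute(
    "SELECT MAX(answer_number) FROM survey_answer WHERE question_id = 1"
).fetchone()[0]

# The same values as strings compare character by character, so '9' wins.
largest_as_string = conn.execute(
    "SELECT MAX(v) FROM (SELECT '9' AS v UNION ALL SELECT '10')"
).fetchone()[0]

# A row that fills two typed columns at once is rejected by the CHECK.
try:
    conn.execute(
        "INSERT INTO survey_answer (question_id, answer_number, answer_string) "
        "VALUES (1, 5, 'five')"
    )
    mixed_rejected = False
except sqlite3.IntegrityError:
    mixed_rejected = True
```

The gap between the two MAX results is a small instance of the type safety problem: numbers kept in a string column sort and compare as text.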


