If you are a Django developer and have to deal with kinds of data where you don’t know the datatype or the number of values to store, your best bet is to go with JSONField.
You get JSONField if you are using Postgres.
When you don’t know the schema of the data that you need to store, the first thing that comes to mind is NOSQL. However, that means sacrificing the comforts of SQL and ForeignKey for many (I personally am happy with both).
Using JSONField makes it a little difficult to do filtering on the data in JSONField.
In my job, I have to deal with the data where we don’t know the number of fields or the type of data beforehand. We opted to go with JSONField. I needed to do some filtering on the values and there is not much support or documentation available for doing filtering on the integer and Decimal values. I ended up doing KeyTransform for the values before doing filtering on them. Here’s what I ended up with.
Consider this model:
class Data(models.Model): data = JSONField(default=dict)
and here we are filling some data:
for i in xrange(10): data = { 'string': 'Data: {}'.format(i), 'integer': i, 'float': i * 1.1 } Data.objects.create(data=data) for i in xrange(10): data = { 'string2': 'Data: {}'.format(i), 'integer2': i, 'float2': i * 1.1 } Data.objects.create(data=data)
Annotation:
In order to access the value inside JSONField and then do filtering on it, you need to annotate it.
An example of annotation:
@classmethod def fetch_values_constant_key(cls): """ Fetches all values of key "string" :return: values """ return Data.objects.annotate(**{ 'val': KeyTransform('string', 'data') }).values_list('val', flat=True)
Tip:
Don’t use “val” as your annotated name. This will override any annotation that you have done in the past if you are doing more than one annotation. Use the key of the value directly as your annotated name. In general, you will be safe without any conflicts, but to be safe you can also prepend some constant (Tip: Use the name of JSONField).
@classmethod def fetch_values_dynamic_key(cls, key): """ Fetches all values using dynamic key :param key: key to look for inside JSONField :return: values """ return Data.objects.annotate(**{ key: KeyTransform(key, 'data') }).values_list(key, flat=True)
You need to use KeyTextTransform for text. Here’s an example of “endswith” filtering applied.
@classmethod def filter_string_dynamic_key(cls, key, value): """ Filters ENDSWITH and returns text from JSONField Key inside JSONField is dynamic :param key: key to look for inside JSONField :param value: value to filter :return: values """ transforms = { key: KeyTextTransform(key, 'data') } annotated = Data.objects.annotate(**transforms) return annotated.filter(**{'{}__endswith'.format(key): value}).values_list(key, flat=True)
Filtering Numbers
Unfortunately, there is no transform provided by Django to do proper filtering on numeric values and get the data back. For obvious reasons, you can’t use KeyTextTransform like in the case of text.
(Not Doing) Code sample here with KeyTransform
I ended up creating my own Transform for integer and Decimal values.
KeyIntegerTransform
class KeyIntegerTransform(KeyTextTransform): def as_sql(self, compiler, connection): key_transforms = [self.key_name] previous = self.lhs while isinstance(previous, KeyTransform): key_transforms.insert(0, previous.key_name) previous = previous.lhs lhs, params = compiler.compile(previous) if len(key_transforms) > 1: return u"((%s %s %%s)::integer)" % (lhs, self.nested_operator), [key_transforms] + params try: int(self.key_name) except ValueError: lookup = u"'%s'" % self.key_name else: lookup = u"%s" % self.key_name return u"((%s %s %s)::integer)" % (lhs, self.operator, lookup), params
Using the KeyIntegerTransform, now you get integer values in return (Nice!).
@classmethod def filter_number_key_integer_transform(cls, key, value): """ Filters GTE (Integers) and returns number from JSONField raises DataError if the data is not in the integer format :param key: key to look for inside JSONField :param value: value to filter :return: values (numbers as int) Ex. 8 """ transforms = { key: KeyIntegerTransform(key, 'data') } annotated = Data.objects.annotate(**transforms) return annotated.filter(**{'{}__gte'.format(key): value}).values_list(key, flat=True)
KeyDecimalTransform
I ended up creating KeyDecimalTransform to handle float values.
class KeyDecimalTransform(KeyTextTransform): def as_sql(self, compiler, connection): key_transforms = [self.key_name] previous = self.lhs while isinstance(previous, KeyTransform): key_transforms.insert(0, previous.key_name) previous = previous.lhs lhs, params = compiler.compile(previous) if len(key_transforms) > 1: return u"((%s %s %%s)::decimal)" % (lhs, self.nested_operator), [key_transforms] + params try: int(self.key_name) except ValueError: lookup = u"'%s'" % self.key_name else: lookup = u"%s" % self.key_name return u"((%s %s %s)::decimal)" % (lhs, self.operator, lookup), params
However, it returns me a decimal object instead of a float 😦
@classmethod def filter_number_key_decimal_transform(cls, key, value): """ Filters GTE and returns number from JSONField raises DataError if the data is not in the integer format :param key: key to look for inside JSONField :param value: value to filter :return: values (numbers as Decimal) Ex. Decimal('8') """ transforms = { key: KeyDecimalTransform(key, 'data') } annotated = Data.objects.annotate(**transforms) return annotated.filter(**{'{}__gte'.format(key): value}).values_list(key, flat=True)
If you don’t care about the values fetched being an int. You can use a more generic transform:
class KeyNumericTransform(KeyTextTransform): def as_sql(self, compiler, connection): key_transforms = [self.key_name] previous = self.lhs while isinstance(previous, KeyTransform): key_transforms.insert(0, previous.key_name) previous = previous.lhs lhs, params = compiler.compile(previous) if len(key_transforms) > 1: return u"((%s %s %%s)::numeric)" % (lhs, self.nested_operator), [key_transforms] + params try: int(self.key_name) except ValueError: lookup = u"'%s'" % self.key_name else: lookup = u"%s" % self.key_name return u"((%s %s %s)::numeric)" % (lhs, self.operator, lookup), params
This will return Decimal objects.
@classmethod def filter_number_key_numeric_transform(cls, key, value): """ Filters GTE and returns number from JSONField :param key: key to look for inside JSONField :param value: value to filter :return: values (numbers as Decimal Objects) Ex. Decimal('8') """ transforms = { key: KeyNumericTransform(key, 'data') } annotated = Data.objects.annotate(**transforms) return annotated.filter(**{'{}__gte'.format(key): value}).values_list(key, flat=True)
All of this sample code can be found at my bitbucket account here. Model and annotation examples, KeyTransform(s). Try running the run_examples to see all the code in action.
I sincerely hope that you find this helpful.
That’s all folks.