{"id":477,"date":"2019-03-25T13:51:51","date_gmt":"2019-03-25T13:51:51","guid":{"rendered":"http:\/\/library.brown.edu\/DigitalTechnologies\/?p=477"},"modified":"2019-03-25T13:51:51","modified_gmt":"2019-03-25T13:51:51","slug":"exporting-django-data","status":"publish","type":"post","link":"https:\/\/library.brown.edu\/create\/digitaltechnologies\/exporting-django-data\/","title":{"rendered":"Exporting Django data"},"content":{"rendered":"<p>We recently had a couple cases where we wanted to dump the data out of a <a href=\"https:\/\/www.djangoproject.com\/\">Django<\/a> database. In the first case (&#8220;tracker&#8221;), we were shutting down a legacy application, but needed to preserve the data in a different form for users. In the second case (&#8220;deposits&#8221;), we were backing up some obsolete data before removing it from the database. We handled the processes in two different ways.<\/p>\n<h3>Tracker<\/h3>\n<p>For the tracker, we used an export script to extract the data. Here&#8217;s a modified version of the script:<\/p>\n<pre>def export_data():\n    now = datetime.datetime.now()\n    dir_name = 'data_%s_%s_%s' % (now.year, now.month, now.day)\n    d = os.mkdir(dir_name)\n    file_name = os.path.join(dir_name, 'tracker_items.dat')\n    with open(file_name, 'wb') as f:\n        f.write(u'\\u241f'.join([\n                    'project name',\n                    'container identifier',\n                    'container name',\n                    'identifier',\n                    'name',\n                    'dimensions',\n                    'note',\n                    'create digital surrogate',\n                    'qc digital surrogate',\n                    'create metadata record',\n                    'qc metadata record',\n                    'create submission package']).encode('utf8'))\n        f.write('\\u241e'.encode('utf8'))\n        for project in models.Project.objects.all():\n            for container in project.container_set.all():\n                print(container)\n                for item in container.item_set.all():\n                    data = u'\\u241f'.join([\n                        project.name.strip(),\n                        container.identifier.strip(),\n                        container.name.strip(),\n                        item.identifier.strip(),\n                        item.name.strip(),\n                        item.dimensions.strip(),\n                        item.note.strip()\n                    ])\n                    item_actions = u'\\u241f'.join([str(item_action) for item_action in item.itemaction_set.all().order_by('id')])\n                    line_data = u'%s\\u241f%s\\u241e' % (data, item_actions)\n                    f.write(line_data.encode('utf8'))<\/pre>\n<p>As you can see, we looped through different Django models and pulled out fields, writing everything to a file. We used the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Unicode_control_characters#Control_pictures\">Unicode Record and Unit Separators<\/a> as delimiters. One advantage of using those is that your data can have commas, tabs, newlines, &#8230; and it won&#8217;t matter. You still don&#8217;t have to quote or escape anything.<\/p>\n<p>Then we converted the data to a spreadsheet that users can view and search:<\/p>\n<pre>import openpyxl\n\nworkbook = openpyxl.Workbook()\nworksheet = workbook.active\n\nwith open('tracker_items.dat', 'rb') as f:\n    data = f.read()\n    lines = data.decode('utf8').split('\\u241e')\n    print(len(lines))\n    print(lines[0])\n    print(lines[-1])\n    for line in lines:\n        fields = line.split('\\u241f')\n        worksheet.append(fields)\nworkbook.save('tracker_items.xlsx')<\/pre>\n<h3>Deposits<\/h3>\n<p>For the deposits project, we just used the built-in Django <a href=\"https:\/\/docs.djangoproject.com\/en\/2.1\/ref\/django-admin\/#dumpdata\">dumpdata<\/a> command:<\/p>\n<pre>python manage.py dumpdata -o data_20180727.dat<\/pre>\n<p>That output file could be used to <a href=\"https:\/\/docs.djangoproject.com\/en\/2.1\/ref\/django-admin\/#loaddata\">load data<\/a> back into a database if needed.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>We recently had a couple cases where we wanted to dump the data out of a Django database. In the first case (&#8220;tracker&#8221;), we were shutting down a legacy application, but needed to preserve the data in a different form for users. In the second case (&#8220;deposits&#8221;), we were backing up some obsolete data before &hellip; <a href=\"https:\/\/library.brown.edu\/create\/digitaltechnologies\/exporting-django-data\/\" class=\"more-link\">Continue reading <span class=\"screen-reader-text\">Exporting Django data<\/span><\/a><\/p>\n","protected":false},"author":110,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-477","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/library.brown.edu\/create\/digitaltechnologies\/wp-json\/wp\/v2\/posts\/477","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/library.brown.edu\/create\/digitaltechnologies\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/library.brown.edu\/create\/digitaltechnologies\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/library.brown.edu\/create\/digitaltechnologies\/wp-json\/wp\/v2\/users\/110"}],"replies":[{"embeddable":true,"href":"https:\/\/library.brown.edu\/create\/digitaltechnologies\/wp-json\/wp\/v2\/comments?post=477"}],"version-history":[{"count":0,"href":"https:\/\/library.brown.edu\/create\/digitaltechnologies\/wp-json\/wp\/v2\/posts\/477\/revisions"}],"wp:attachment":[{"href":"https:\/\/library.brown.edu\/create\/digitaltechnologies\/wp-json\/wp\/v2\/media?parent=477"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/library.brown.edu\/create\/digitaltechnologies\/wp-json\/wp\/v2\/categories?post=477"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/library.brown.edu\/create\/digitaltechnologies\/wp-json\/wp\/v2\/tags?post=477"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}