Pipeline.append vs pipeline.extend

I searched online for the difference between pipeline.append and pipeline.extend but couldn't find anything. One other post mentioned something like append retains the existing info on the pipeline while extend does not. I'd still like to know what exactly they do to the pipeline. Do I always need to extend first and then append?

How about three examples?

Code:

def appendExtendEx1():
    # Example 1 - skip and limit are docs of stages
    #             count_unwind is an array of two stages
    pipeline_append = []
    pipeline_extend = []

    skip_stage = {"$skip": 5}
    limit_stage = {"$limit": 10}
    count_unwind_stage = [{"$count": "field"}, {"$unwind": "field"}]
    
    pipeline_append.append(skip_stage)
    pipeline_append.append(limit_stage)
    pipeline_append.append(count_unwind_stage)
    
    pipeline_extend.extend(skip_stage)
    pipeline_extend.extend(limit_stage)
    pipeline_extend.extend(count_unwind_stage)

    print("Append Ex1 =>", pipeline_append)
    print("Extend Ex1 =>", pipeline_extend, end="\n\n")

def appendExtendEx2():
    # Example 2 - skip, limit, count_unwind are array of stages
    pipeline_append = []
    pipeline_extend = []

    skip_stage = [{"$skip": 5}]
    limit_stage = [{"$limit": 10}]
    count_unwind_stage = [{"$count": "field"}, {"$unwind": "field"}]
    
    pipeline_append.append(skip_stage)
    pipeline_append.append(limit_stage)
    pipeline_append.append(count_unwind_stage)
    
    pipeline_extend.extend(skip_stage)
    pipeline_extend.extend(limit_stage)
    pipeline_extend.extend(count_unwind_stage)

    print("Append Ex2 =>", pipeline_append)
    print("Extend Ex2 =>", pipeline_extend, end="\n\n")

def appendExtendEx3():
    # Example 3 - a single array of two stages
    pipeline_append = []
    pipeline_extend = []
    
    pipeline_append.append(
        [
            { "$match": {"field": "value"} }, 
            { "$project": {"_id": 0} }
        ]
    )

    pipeline_extend.extend(
        [
            { "$match": {"field": "value"} }, 
            { "$project": {"_id": 0} }
        ]
    )
            
    print("Append Ex3 =>", pipeline_append)
    print("Extend Ex3 =>", pipeline_extend)

appendExtendEx1()
appendExtendEx2()
appendExtendEx3()
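If you don't want to run all three, here is a condensed sketch of what Example 1 boils down to. It only uses plain Python lists, so no MongoDB connection is assumed. The thing to watch is that extend iterates over whatever you pass it, so handing it a dict adds the dict's keys rather than the stage itself.

```python
# Condensed version of Example 1: plain Python lists only.
skip_stage = {"$skip": 5}                                          # a single stage (dict)
count_unwind_stage = [{"$count": "field"}, {"$unwind": "field"}]   # a list of two stages

pipeline_append = []
pipeline_append.append(skip_stage)          # adds the dict as one element
pipeline_append.append(count_unwind_stage)  # adds the whole list as ONE nested element

pipeline_extend = []
pipeline_extend.extend(skip_stage)          # iterates the dict, so only the key "$skip" is added
pipeline_extend.extend(count_unwind_stage)  # adds each stage of the list as its own element

print("append =>", pipeline_append)
# [{'$skip': 5}, [{'$count': 'field'}, {'$unwind': 'field'}]]
print("extend =>", pipeline_extend)
# ['$skip', {'$count': 'field'}, {'$unwind': 'field'}]
```

Neither result is a flat list of stage documents, which is why mixing up the two methods matters.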

In summary, when building aggregation pipelines, you would typically:

  • use append to add a single stage, which is one document
    pipeline.append({stage})

  • use extend to add an array of stages, each of which becomes its own element of the pipeline
    pipeline.extend([{stage1}, {stage2}, ...{stageN}])
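As a minimal sketch of that rule of thumb (the stage contents here are just placeholders reused from the examples above), append for one stage and extend for a list of stages can be mixed in any order, and the pipeline stays a flat list of documents:

```python
pipeline = []

# One stage, one document: append it.
pipeline.append({"$match": {"field": "value"}})

# Several stages already collected in a list: extend with it.
pipeline.extend([{"$skip": 5}, {"$limit": 10}])

print(pipeline)
# [{'$match': {'field': 'value'}}, {'$skip': 5}, {'$limit': 10}]
```

So there is no need to extend first and then append; the order of the calls only decides the order of the stages.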


Awesome, thanks. So how does this affect the functionality in the end? If I have all the stages broken down into parts with extend, how will the result differ from a pipeline where some stages are grouped together in an array?

The real question is: what is valid syntax for an aggregation pipeline? Give it a try and draw your own conclusions.
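One way to try it without a server is a rough local shape check. The helper below, looks_like_pipeline, is a hypothetical name and not part of any driver. It only assumes that a pipeline is a flat list in which every element is a document with exactly one key naming the stage, and it does not check whether the stage itself is valid.

```python
from collections.abc import Mapping

def looks_like_pipeline(pipeline):
    """Hypothetical helper: True if pipeline is a list whose every element
    is a document with exactly one key that starts with "$"."""
    if not isinstance(pipeline, list):
        return False
    for stage in pipeline:
        # A nested list or a bare string is not a stage document.
        if not isinstance(stage, Mapping) or len(stage) != 1:
            return False
        (name,) = stage  # the single key of the stage document
        if not isinstance(name, str) or not name.startswith("$"):
            return False
    return True

stages = [{"$match": {"field": "value"}}, {"$project": {"_id": 0}}]

appended = []
appended.append(stages)   # one element: the whole list, nested

extended = []
extended.extend(stages)   # two elements: one per stage

print("append:", looks_like_pipeline(appended))   # False
print("extend:", looks_like_pipeline(extended))   # True
```

The appended version fails because its only element is a list, not a stage document.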